ENCODE-Networks Source Code for Context-Specific TF Co-Association Analyses

Q:
Hello,
I am interested in your paper published in Nature, 06 September 2012, “Architecture of the human regulatory network derived from ENCODE data”. In particular, we are interested in the framework of context-specific TF co-association analysis described in this paper. We would like to apply this method to our in-house datasets. It’s exciting that the code for these analyses is “Available soon” (the file “enets21.coassoc-code.tgz” on http://encodenets.gersteinlab.org/). Do you know whether the code for the co-association analysis in this paper is available now? If so, it might save us a lot of time. Thanks for your help!

A:
The main machine learning method used for the analysis is RuleFit3, which is available here:
http://statweb.stanford.edu/~jhf/r-rulefit/rulefit3/R_RuleFit3.html

Detailed instructions on preparing the input data and computing the various scores are in the supplement of the paper.

I don’t have a polished code package that is ready for use by the general public. The code that I wrote for the analyses in the paper is here: https://code.google.com/p/tf-coassociation/source/browse/#svn%2Ftrunk%2Fscripts . I have to warn you that it’s not designed to work on general datasets, as it includes scripts that were written to run on our local cluster. The core functions are in
https://code.google.com/p/tf-coassociation/source/browse/trunk/scripts/assoc.matrix.utils.R . The code is reasonably well commented, so hopefully it will help.

Multinet (Unified global network) – academic use

Q:
I read your seminal paper “Interpretation of Genomic Variants Using a Unified Biological Network Approach” recently published in PLoS Computational Biology. I have a few queries:
Is the network available for academic use?
Can we download the relevant multinet to form hypotheses and do experiments?

A:
Please find the downloadable network at
http://homes.gersteinlab.org/Khurana-PLoSCompBio-2013/

Interaction Data set

Q:

help of your paper “Redefining Nodes and Edges: Relating 3D Structures to Protein Networks Provides Insight
into their Evolution”. Now I need to get the proteins in Pfam that are involved in interactions, along with their crystal structures.
I would be very grateful if you could send me a link to a more detailed format of the SIN v0.9 data.

A:

My understanding from your email is that you would like to know the Pfam IDs
and the corresponding crystal structures (i.e., the PDB IDs) for the
interactions in the SIN. To do this, you will have to process two
separate datasets together, but this will not be difficult. Here are the
steps:

i) Access the raw SIN data (http://networks.gersteinlab.org/structint/). On
this page, click on “composite dataset” under the download column for SIN
v0.9 data. This is a list of open reading frame IDs corresponding to each
interaction (the first and third columns), as well as whether the
interaction is taken from Pfam.
ii) Open the text file I’ve attached with this email. Each row contains
several pieces of information, but what you would like to do is find the PDB
IDs (contained in the 2nd column) corresponding to each Ensembl Gene ID (the
first column). This Ensembl Gene ID is taken from (i) above.

I should mention that there are two problems with the procedure outlined
above. First, it will not provide crystal structures for all
interactions; I’m not sure why this is the case. Second, for some
interactions, multiple crystal structures are available, and it is not clear
which structure was used in Pfam. Nitin (CC’ed on this email) may know how
to deal with these issues. If you are still having difficulty after further
efforts to get the data you need, please contact Nitin or me again.
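
For readers who want to script the two steps above, here is a minimal Python sketch of the lookup. It is not the code used for the paper. It assumes both files are tab-delimited, that the interaction IDs sit in the first and third columns of the SIN file, and that the mapping file has the gene ID in the first column and the PDB ID in the second, as described above. The function names and all IDs in the sample data are hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict


def load_pdb_map(handle, gene_col=0, pdb_col=1):
    """Map each gene ID to the set of PDB IDs listed for it."""
    pdb_by_gene = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) <= max(gene_col, pdb_col):
            continue  # skip blank or short rows
        pdb_by_gene[row[gene_col]].add(row[pdb_col])
    return pdb_by_gene


def annotate_interactions(handle, pdb_by_gene, col_a=0, col_b=2):
    """Yield (id_a, id_b, pdbs_a, pdbs_b) for each interaction row."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) <= max(col_a, col_b):
            continue  # skip blank or short rows
        a, b = row[col_a], row[col_b]
        # .get() avoids adding empty entries for IDs with no structure
        yield a, b, sorted(pdb_by_gene.get(a, ())), sorted(pdb_by_gene.get(b, ()))


# Hypothetical stand-ins for the two files (real files would be opened
# with open(path, newline="") instead of io.StringIO).
sin_text = "GENE_A\t-\tGENE_B\tpfam\nGENE_A\t-\tGENE_C\tother\n"
map_text = "GENE_A\t1ABC\nGENE_A\t2XYZ\nGENE_B\t3DEF\n"

pdb_by_gene = load_pdb_map(io.StringIO(map_text))
rows = list(annotate_interactions(io.StringIO(sin_text), pdb_by_gene))
for a, b, pdbs_a, pdbs_b in rows:
    print(a, b, pdbs_a or "no structure", pdbs_b or "no structure")
```

The sample data is chosen to show both problems mentioned above: GENE_C has no structure in the mapping, and GENE_A has more than one, so the script can only list the candidates, not say which one Pfam used.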

Integrated regulatory network

Q:

I read your recent paper “Construction and Analysis of an Integrated
Regulatory Network Derived from High-Throughput Sequencing Data” in PLOS
Computational Biology with great interest. I would like to know whether the
data for your integrated regulatory networks is available, or whether you
would be willing to share it. I’m part of a group of statisticians in Evry (France)
working on probabilistic models for biological networks. Our aim is to
retrieve groups of nodes that have similar topological behaviours. The
fact that your data has three types of nodes and a hierarchical structure among
TFs and miRNAs, and that you made a biological analysis of this structure,
makes it very interesting for us as a test of the methods we have
developed. Would it be possible for you to send me the C. elegans network
and the corresponding hierarchical structure? Any use of it would of course
be referenced.

A:

I have uploaded the worm network data to http://archive.gersteinlab.org/proj/mirnet
It comprises 3 files:

cel_TF_Target_GID.net: TF->gene interactions
cel_TF_MIR_GID.net: TF->miR interactions
cel_miR_conservedTarget_Kris3way_GID.net: miR->gene interactions

The node type is labeled as “MIR”, “TF” or “X” in brackets.
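
To combine the three files into one typed network, something like the following Python sketch may be a starting point. The exact line layout of the .net files is not described here, so the sketch assumes (and this is only an assumption) that each line holds two whitespace-separated node labels, each ending in its bracketed type, e.g. a label like name[TF]. The function names and node names in the sample are hypothetical; please check a few lines of the real files and adjust the pattern before relying on it.

```python
import re

# Assumed label layout: a node name followed by its type in square
# brackets or parentheses, e.g. "name[TF]" or "name(TF)".
NODE_RE = re.compile(r"^(?P<name>.+?)\s*[\[(](?P<type>MIR|TF|X)[\])]$")


def parse_node(token):
    """Split one node label into (name, type)."""
    match = NODE_RE.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized node label: {token!r}")
    return match.group("name"), match.group("type")


def parse_edges(lines):
    """Return (edges, node_types) from an iterable of edge lines."""
    edges = []
    node_types = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        src, src_type = parse_node(fields[0])
        dst, dst_type = parse_node(fields[1])
        node_types[src] = src_type
        node_types[dst] = dst_type
        edges.append((src, dst))
    return edges, node_types


# Hypothetical sample lines standing in for the three files.
sample = ["tf1[TF] gene1[X]", "tf1[TF] mir1[MIR]", "mir1[MIR] gene1[X]"]
edges, node_types = parse_edges(sample)
print(edges)
print(node_types)
```

Keeping the node types in a separate dictionary makes it easy to read all three files into the same edge list and still tell TFs, miRNAs and other genes apart afterwards.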