OrthoClust – for more than two species

I just read your recently published paper on the OrthoClust approach. It is well-grounded work from both a practical and a mathematical point of view.

I ran your R scripts on my own data and they worked perfectly. However, I am wondering how I can use the scripts for more than two species.

I would appreciate it if you could help me find a solution.

Thanks for your interest in OrthoClust. OrthoClust definitely works on more than two species. The R script is a primitive version that illustrates the concept outlined in the paper. We understand the importance of the N-species generalization, and we have put up new MATLAB code for N species. It makes use of efficient code written by Mucha and Porter that implements the Louvain algorithm for modularity optimization. The third-party code as well as our wrapper is now on the gersteinlab GitHub.
Apart from MATLAB, we are planning to provide a wrapper for Python or R later.
The N-species code is not exactly what we used for the paper, so if you find any bugs or have any questions, please let me know. We are trying to make a more user-friendly package anyway.

Data associated w/paper “Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data”

I recently read your article “Construction
and Analysis of an Integrated Regulatory Network Derived from
High-Throughput Sequencing Data”. In the last year, I measured mRNA and
miRNA expression in the different types of mouse skeletal muscle fibers to
discover the different regulatory circuits activated in fast and slow
myofibers. I designed a preliminary network using the databases of miRNA –
target mRNA and protein – protein interactions, and I have started to
include my expression data in order to understand the biological meaning. I
was wondering if it is possible to use your more accurate mouse regulatory
network for my data. Is this network free to use? In the article and on the
website of your laboratory I did not find any file or link with the complete
networks that you describe. I am not a computational biologist, but the
paper is very interesting and I think that the network that you designed with
your method could be very useful for the scientific community.

Attached are three files for our three mouse networks: 1) miRNAs targeting genes (this is not our calculation; it was downloaded from TargetScan),
2) TFs targeting genes, and 3) TFs targeting miRNAs based on ChIP-Seq data of 12 TFs.
The files are in plain text format. The first column lists the regulators and the second column lists the targets. The bracket next to a gene name gives the class of the gene: TF for transcription factors, MIR for miRNAs, and X for non-TF protein-coding genes.
Thank you for your interest in our paper. I hope this information will be useful for your work.
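For readers who want to load files in the format described above, here is a minimal Python sketch. It is not part of the original reply. It assumes the two columns are tab-separated and that each entry looks like `name(CLASS)` or `name [CLASS]`; the exact bracket style in the attached files is not stated, so adjust the pattern if needed. The function names and the placeholder gene names (`TfA`, `mirB`, `GeneC`) are made up for illustration.

```python
import re
from collections import defaultdict

# ASSUMPTION: an entry is a gene name followed by its class in brackets,
# e.g. "TfA(TF)" or "TfA [TF]". The real files may use a different layout.
NODE_PATTERN = re.compile(r"^(?P<name>.+?)\s*[\(\[](?P<cls>TF|MIR|X)[\)\]]$")


def parse_node(token):
    """Split 'name(CLASS)' into (name, CLASS); CLASS is None if no bracket is found."""
    token = token.strip()
    match = NODE_PATTERN.match(token)
    if match is None:
        return token, None
    return match.group("name"), match.group("cls")


def parse_network(lines):
    """Return a list of ((regulator, class), (target, class)) edges."""
    edges = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")  # ASSUMPTION: tab-separated columns
        if len(fields) < 2:
            continue  # skip rows without a target column
        edges.append((parse_node(fields[0]), parse_node(fields[1])))
    return edges


def targets_by_regulator(edges):
    """Group target names by regulator name."""
    grouped = defaultdict(set)
    for (regulator, _), (target, _) in edges:
        grouped[regulator].add(target)
    return dict(grouped)


# Placeholder rows, not taken from the real networks.
example_lines = [
    "TfA(TF)\tTfB(TF)\n",
    "TfA(TF)\tmirB(MIR)\n",
    "mirB(MIR)\tGeneC(X)\n",
]
example_edges = parse_network(example_lines)
example_targets = targets_by_regulator(example_edges)
```

To read one of the real files, pass an open file handle to `parse_network` in place of `example_lines`.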

Question re ENCODE data on website


I’ve been incorporating the ENCODE data from your webpage in my analyses
(http://encodenets.gersteinlab.org/). The data is fantastic, but I have
questions regarding the enets*.GM_proximal_*filtered_network.txt data.

The filtered dataset actually contains more regulators than the unfiltered dataset, making me speculate that the unfiltered data file is not complete:
[bb447@compute-8-2 TF]$ cut -f1 enets6.GM_proximal_unfiltered_network.txt | sort -u | wc -l
[bb447@compute-8-2 TF]$ cut -f1 enets8.GM_proximal_filtered_network.txt | sort -u | wc -l

Could it be possible that the file is incomplete?

The updated files have been uploaded to the site. Thanks again for pointing this out.
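The check in the question counts the distinct entries in the first column of each file. A minimal Python sketch of the same comparison follows; it also lists which regulators appear only in the filtered file, which may help when re-checking the updated downloads. It assumes tab-separated columns (the default delimiter for `cut -f1`). The function names and the tiny demo files are made up for illustration and do not reflect the real data.

```python
import os
import tempfile


def unique_first_column(path):
    """Return the set of distinct values in the first tab-separated column."""
    values = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:  # unlike `cut`, blank lines are skipped here
                values.add(line.split("\t", 1)[0])
    return values


def missing_regulators(filtered_path, unfiltered_path):
    """Regulators present in the filtered file but absent from the unfiltered one."""
    filtered = unique_first_column(filtered_path)
    unfiltered = unique_first_column(unfiltered_path)
    return sorted(filtered - unfiltered)


# Tiny made-up files, only to show the behaviour; names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    unfiltered_file = os.path.join(tmp, "unfiltered.txt")
    filtered_file = os.path.join(tmp, "filtered.txt")
    with open(unfiltered_file, "w", encoding="utf-8") as handle:
        handle.write("RegA\tGene1\nRegA\tGene2\nRegB\tGene1\n")
    with open(filtered_file, "w", encoding="utf-8") as handle:
        handle.write("RegA\tGene1\nRegB\tGene1\nRegC\tGene3\n")

    n_unfiltered = len(unique_first_column(unfiltered_file))
    n_filtered = len(unique_first_column(filtered_file))
    only_in_filtered = missing_regulators(filtered_file, unfiltered_file)
```

With the real downloads, the two paths would be `enets8.GM_proximal_filtered_network.txt` and `enets6.GM_proximal_unfiltered_network.txt`. An empty result from `missing_regulators` would mean every filtered regulator also appears in the unfiltered file.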

Data re “Architecture of the human regulatory network derived from ENCODE data”

I am very familiar with the ENCODE TF datasets, as I’ve been applying them to various problems in my PhD. I was interested in the expression analysis across human tissues for the ((miR –> TF) –> targets) FFL. There is a reference in the Supplementary file (section H) to the protein-coding expression atlas of Su et al. 2004 for the TF and protein-coding targets in this loop, but there doesn’t seem to be a reference for the corresponding miRNA expression data. I assume it would be Landgraf et al. 2007, ‘A mammalian microRNA expression atlas based on small RNA library sequencing’, since this allows matched tissues and samples with Su et al. However, it might be some other dataset. It would be helpful to be able to replicate and extend the FFL analysis using the correct data. Would you be able to forward this email to the relevant person(s) to confirm whether the microRNA expression was taken from the Landgraf atlas? Many thanks for your help.

Slight correction: The FFL studied for expression pattern of
components is the other way round: ((TF –> miR) –> targets).

The miRNA expression is actually from
Lu et al., Nature 2005.

if you go to
under the heading "MicroRNA Expression Profiles Classify Human Cancers"
see files