ra | Gerstein Lab FAQs

Q:
I read your paper about the co-associations among TF binding events, (Architecture of the human regulatory network derived from ENCODE data), and got interested in your original clustering algorithm. Now, in our laboratory, we are developing a new clustering algorithm for a large number of genomic data, and implemented its prototype algorithm. However, the accuracy of our algorithm is not so completed, and we have to evaluate it. Thus, we want to use your algorithm as the fine basis, so how can we use it? If the program is available for us, can you tell us the way to use it?

A:
In that paper we used the Rulefit3 package from Prof. Jerome Friedman; there is an R package available at the link below. Our use of the algorithm is extensively documented in Section C of the Supplementary Materials.

Rulefit3
http://dx.doi.org/10.1214/07-Aoas148
http://www-stat.stanford.edu/~jhf/r-rulefit/rulefit3/R_RuleFit3.html

Architecture of the human regulatory network derived from ENCODE data http://dx.doi.org/10.1038/Nature11245

With regards to the paper published in Nature, Architecture of the human regulatory network derived from ENCODE data, I have been perusing the Supplementary Information and find that reference No. 69 seems, to the best of my belief, to have been mapped incorrectly. I would like to provide a quote which, in my understanding, promises a reference to a RuleFit3 manuscript but instead corresponds to a paper concerning Transcriptional Regulation in Mast Cells:

The number of rules is not set a priori but is rather learned from the data itself. Details are provided in the RuleFit3 manuscript69. -P. 14/271

69 Bockamp, E. O. et al. Transcriptional regulation of the stem cell leukemia gene by PU.1 and Elf-1. J. Biol. Chem. 273, 29032-29042 (1998).

It turns out that references 69-71 in section C2 of the supplementary material were not correctly added to the reference list. References 69-71 in later sections refer to the correct articles. Below are the correct citations for refs 69-71 in section C2 of the supplement.

Rulefit3 (ref 69)
Frieman, J. H. & Popescu, B. E. Predictive Learning Via Rule Ensembles. Annals Applied Stat. 2, 916-954, doi:10.1214/07-Aoas148 (2008).
http://dx.doi.org/10.1214/07-Aoas148

the well-known random forest algorithm (ref 70)
Breiman, L. Random forests. Mach Learn 45, 5-32, doi:10.1023/A:1010933404324 (2001). http://dx.doi.org/10.1023/A:1010933404324

the GREAT Functional Annotation server (ref 71)
McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nature Biotechnology 28, 495-U155, doi:10.1038/nbt.1630 (2010). http://dx.doi.org/10.1038/nbt.1630
http://great.stanford.edu/

Gerstein Lab FAQs

Frequently Asked Questions

Tag Archives: ra

rulefit3 in encodenets

missing citations in encodenets supplement