Code for random forest models in paper “Comparative analysis of the transcriptome across distant species”

I read your recent letter in Nature ("Comparative analysis of the transcriptome across distant species") and would like to use your strategy to model and predict gene expression profiles from histone modification ChIP-seq data in Eucalyptus.

Currently we have RNA-seq data for seven tissues, plus some total, stranded and small RNA-seq data for selected tissues. I’m generating ChIP-seq profiles of five histone modifications in two tissues, and I’d like to see to what degree we can predict mRNA-seq expression levels from these profiles. We also have DNase-seq and TF ChIP-seq experiments planned for the future.

I was wondering whether you have any workflows or scripts that you would be willing to share with us, to help us better understand how the randomForest package was used for the modeling (I don’t have a programming background, but we have an able bioinformatics unit). Alternatively, it would be a pleasure to collaborate on a publication with your lab if your team could assist us with the modeling aspect.

There are some scripts associated with:
