I’ve been trying to apply the Loregic algorithm to other organisms to further validate the method; however, I’m finding some inconsistencies that could be related to data manipulation (choosing datasets, merging and mean-centering samples).
Furthermore, I’ve also found those inconsistencies when trying to reproduce the analysis from yeast datasets provided in your publication (probably due to the same data manipulation issues described before).
Would you be able to provide a more in-depth protocol for using Loregic with multiple datasets (how you handled the data, for example) in order to improve the consistency of the method between labs?
Yes, we normalized the yeast data. Here is how we preprocessed it:
1) got time-series yeast cell cycle data (alpha, cdc15, cdc28) from
which were logarithm values.
2) standardized 2^(data) so that each time point has mean=0 and sigma=1
3) binarized the standardized data using the function
binarizeTimeSeries with ‘kmeans’ clustering in the R package BoolNet.
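As a rough illustration of steps 2) and 3), here is a minimal Python sketch. It is not the code we ran (that was R with BoolNet), and it makes several assumptions: the logarithm is taken to be base 2 (suggested by the 2^(data) step), the standard deviation is the sample standard deviation, binarization is done gene by gene, and the two-cluster k-means is a simple deterministic version started from each gene's minimum and maximum rather than BoolNet's own routine. The function names are hypothetical.

```python
import statistics

def standardize_time_points(log_expr):
    """log_expr: list of gene rows, each a list of log2 values per time point.
    Returns z-scores of 2**value, computed separately for each time point."""
    linear = [[2.0 ** v for v in row] for row in log_expr]
    n_genes, n_times = len(linear), len(linear[0])
    z = [[0.0] * n_times for _ in range(n_genes)]
    for t in range(n_times):
        column = [linear[g][t] for g in range(n_genes)]
        mean = statistics.mean(column)
        sd = statistics.stdev(column)  # assumption: sample sd (n - 1)
        if sd == 0:
            raise ValueError("time point %d has zero variance" % t)
        for g in range(n_genes):
            z[g][t] = (column[g] - mean) / sd
    return z

def two_means_binarize(values, max_iter=100):
    """Split one gene's values into low (0) and high (1) with 1-D 2-means."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0] * len(values)  # constant gene: nothing to separate
    for _ in range(max_iter):
        threshold = (lo + hi) / 2.0
        low_group = [v for v in values if v <= threshold]
        high_group = [v for v in values if v > threshold]
        new_lo = statistics.mean(low_group)
        new_hi = statistics.mean(high_group)
        if new_lo == lo and new_hi == hi:
            break  # cluster centres stopped moving
        lo, hi = new_lo, new_hi
    threshold = (lo + hi) / 2.0
    return [1 if v > threshold else 0 for v in values]

def binarize_expression(log_expr):
    """Standardize each time point, then binarize each gene's time series."""
    return [two_means_binarize(row) for row in standardize_time_points(log_expr)]

# Toy input: 4 genes (rows) x 3 time points (columns), log2 scale.
toy = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [3.0, 0.0, 2.0]]
print(binarize_expression(toy))
```

Because k-means in R is normally run with random restarts, a deterministic sketch like this one may give slightly different 0/1 calls for genes whose values are not clearly bimodal.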
I am writing to ask if you could kindly share with me the yeast cell cycle binarized expression data that you used in Loregic’s paper.
In our group we would like to find a method to identify the logic rules that govern cooperativity of multiple regulators, in GRNs built from differentially expressed genes.
The number of samples we will have is limited, so we will rely mainly on information from the literature, and as a first step we would like to test our method on your binarized expression data.
We used BoolNet to binarize the data,
http://cran.r-project.org/web/packages/BoolNet/index.html. We also tried ArrayBin,
http://cran.r-project.org/web/packages/ArrayBin/index.html, which gave
Loregic results very similar to those from BoolNet (see Supplemental Figure).
The yeast cell cycle data we used was the classical microarray data
published in 1998 (Spellman & Cho):
I would like to apply the bulk-tissue deconvolution algorithm from your recent paper (Wang et al., 2018) using our own single-cell RNA-Seq data and the bulk-tissue RNA-Seq data from Gandal et al., 2018. I couldn’t find code for the deconvolution steps on the Gerstein Lab GitHub page (https://github.com/gersteinlab/PsychENCODE-DSPN) or on the PsychENCODE resources page. I only found the results of the cell fraction calculations. Would you be able to point me towards how I can apply this algorithm?
We used the non-negative least squares method for deconvolution and implemented it using the R function nnls (https://www.rdocumentation.org/packages/lsei/versions/1.2-0/topics/nnls). For example, nnls(C, bi) estimates the cell fractions for the ith tissue sample, where C is the cell-type gene expression matrix (rows: genes, columns: cell types) and bi is the gene expression vector for the ith tissue sample.
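To make the per-sample fit concrete, here is a small self-contained Python sketch of the same optimization problem: minimize ||C x - bi||^2 subject to x >= 0. It is only an illustration, not the code we ran. The solver below is a plain cyclic coordinate descent written out by hand so the example has no dependencies; it is not the algorithm inside the R nnls function. The function name, the toy matrix and the mixing weights are all made up. In Python, scipy.optimize.nnls(C, b) would be the usual library call for the same problem.

```python
def nnls_coordinate_descent(C, b, sweeps=500, tol=1e-12):
    """Minimise ||C x - b||^2 subject to x >= 0 by cyclic coordinate descent.
    C: list of gene rows, one column per cell type. b: one bulk sample."""
    n_genes, n_types = len(C), len(C[0])
    # Precompute H = C^T C and q = C^T b once.
    H = [[sum(C[g][j] * C[g][k] for g in range(n_genes))
          for k in range(n_types)] for j in range(n_types)]
    q = [sum(C[g][j] * b[g] for g in range(n_genes)) for j in range(n_types)]
    x = [0.0] * n_types
    for _ in range(sweeps):
        largest_change = 0.0
        for j in range(n_types):
            if H[j][j] == 0.0:
                continue  # cell type with all-zero expression stays at 0
            grad_j = sum(H[j][k] * x[k] for k in range(n_types)) - q[j]
            # Exact minimiser along coordinate j, clipped at zero.
            new_xj = max(0.0, x[j] - grad_j / H[j][j])
            largest_change = max(largest_change, abs(new_xj - x[j]))
            x[j] = new_xj
        if largest_change < tol:
            break
    return x

# Toy signature matrix: 6 genes (rows) x 3 cell types (columns).
C = [[10, 1, 0], [8, 0, 1], [1, 9, 0], [0, 7, 2], [1, 0, 8], [0, 1, 9]]

# Hypothetical bulk sample built as a known mixture of the three cell types.
true_weights = [0.7, 0.3, 0.0]
b_i = [sum(C[g][k] * true_weights[k] for k in range(3)) for g in range(len(C))]

print(nnls_coordinate_descent(C, b_i))
```

On this noise-free toy sample the estimate should land very close to the mixing weights used to build it. For a real bulk matrix, the same call is simply repeated once per tissue sample.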