I am currently running HiC-spector on mouse genome datasets with bin size 5kb. I noticed that it requires quite a lot of memory, so I was wondering if there were tests done on HiC-spector’s space complexity, as I couldn’t find such studies in the Supplementary Data.
We didn’t analyze the space complexity explicitly. Because the contact maps are stored as sparse matrices, memory does not grow quadratically with the number of bins. In general, if the calculation is done chromosome by chromosome, 5kb should be fine.
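As a rough illustration of why sparse storage matters here, the sketch below compares the memory of a dense contact map with a coordinate-style sparse one for a single chromosome. The chromosome length, the number of nonzero contacts, and the byte sizes are illustrative assumptions, not measurements of HiC-spector itself, and the function names are hypothetical.

```python
import math

def n_bins(chrom_length_bp, bin_size_bp):
    """Number of bins needed to cover one chromosome."""
    return math.ceil(chrom_length_bp / bin_size_bp)

def dense_bytes(n, value_bytes=8):
    """Memory for a full n x n matrix of 8-byte values."""
    return n * n * value_bytes

def sparse_coo_bytes(nnz, value_bytes=8, index_bytes=4):
    """Memory for a coordinate-format sparse matrix:
    one value plus a row index and a column index per nonzero entry."""
    return nnz * (value_bytes + 2 * index_bytes)

# Assumed inputs: a chromosome of roughly 195 Mb at 5 kb resolution,
# with a hypothetical 50 million nonzero bin pairs.
bins = n_bins(195_000_000, 5_000)
nnz = 50_000_000

print(f"bins per side: {bins}")
print(f"dense:  {dense_bytes(bins) / 1e9:.1f} GB")
print(f"sparse: {sparse_coo_bytes(nnz) / 1e9:.1f} GB")
```

The dense cost depends only on the square of the bin count, while the sparse cost depends on how many bin pairs actually have contacts, which is why processing one chromosome at a time keeps the footprint manageable.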
Split-Read Identification Calibration (SRIC) download
I am interested in running the SRIC software to identify variants; however, I cannot find a download link or GitHub page for it. Would you please be so kind as to provide me with a link?
Unfortunately, we didn’t make a software package for SRiC.
Your postdoc gave a great talk about the EN-TEX work at the ASHG meeting. The data
generated from this project will benefit the community greatly. Could you
please tell us when and how the data will be made available for external users?
Thank you for your suggestion. In the meantime, you can find the correct versions of fasta and blast freely available online. To ease the user experience, we provide links to the two packages on the website http://pseudogene.org/pseudopipe/ .
I am writing this e-mail to inquire about STRESS software.
We have learned from your paper (Structure 2016,24:826-837)
that STRESS software can be used for identifying allosteric pockets.
We are interested in using the software for our drug discovery research.
We will perform evaluation of the software for a start.
Will you allow us to use STRESS software for the purpose of our
commercial drug discovery project free of charge?
As this is an urgent project, we would highly appreciate if you could
See the license at https://sites.gersteinlab.org/permissions/
We have read with much interest your article about the HiC-Spector method.
We are currently working on a method that we hope will help identify
conserved features across different HiC-maps. As the problem we are studying
and the one tackled in your article are closely related, we think it would
be useful for us to test our method using your data set as the ground truth.
We kindly ask whether you would be able to provide us with the HiC maps used
in the article for this purpose.
We’ve just been reading your excellent papillary RCC WGS paper- there is a
real paucity of data on papillary cases, so many thanks for this.
Sorry if I missed it, but do you happen to know the SNV and (small scale)
indel counts across the cohort? We’re especially interested in indel
mutations in RCC, and wondered what proportion of your variants were of this
For tumor SNV counts, you can find them in the supplemental table (https://doi.org/10.1371/journal.pgen.1006685.s009). We also include SVs in the supplements. Unfortunately, we do not have indels for those tumors.
I’ve been trying to apply the Loregic algorithm in other organisms in order to further validate the method; however, I’m finding some inconsistencies that could be related to data manipulation (choosing datasets, merging and mean-centering samples).
Furthermore, I’ve also found those inconsistencies when trying to reproduce the analysis from yeast datasets provided in your publication (probably due to the same data manipulation issues described before).
Would you be able to provide a more in-depth protocol for using Loregic with multiple datasets (how you handled the data, for example) in order to improve the consistency of the method between labs?
Yes, we normalized the yeast data. Here is how we preprocessed it:
1) got time-series yeast cell cycle data (alpha, cdc15, cdc28) from
which were logarithm values.
2) standardized 2^(data) such that each time point has mean = 0 and sigma = 1
3) binarized the standardized data using the function
binarizeTimeSeries with ‘kmeans’ clustering in the R package BoolNet.
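
The steps above can be sketched roughly as follows. This is a minimal Python analogue for illustration, not the code we ran: it assumes the input is a genes-by-time-points matrix of log2 values, uses the sample standard deviation, and replaces BoolNet's binarizeTimeSeries with a simple per-gene two-cluster k-means. All function names here are hypothetical.

```python
import statistics

def standardize_columns(matrix):
    """Scale each column (time point) to mean 0 and sample standard deviation 1.
    Assumes no column is constant (a zero standard deviation would fail)."""
    cols = list(zip(*matrix))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in matrix]

def binarize_two_means(values, max_iter=100):
    """1-D k-means with two clusters; the higher-centred cluster maps to 1."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0] * len(values)
    labels = []
    for _ in range(max_iter):
        # Assign each value to the nearer centre (ties go to the low cluster).
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        new_lo = statistics.mean(v for v, l in zip(values, labels) if l == 0)
        new_hi = statistics.mean(v for v, l in zip(values, labels) if l == 1)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels

def preprocess(log2_matrix):
    """Undo the log, standardize each time point, then binarize each gene."""
    linear = [[2 ** v for v in row] for row in log2_matrix]
    standardized = standardize_columns(linear)
    return [binarize_two_means(row) for row in standardized]

# Toy input: 3 genes (rows) x 3 time points (columns), log2 scale.
toy = [[0.0, 1.0, 3.0], [1.0, 2.0, 0.0], [2.0, 0.0, 1.0]]
print(preprocess(toy))
```

Results from this sketch need not match BoolNet exactly, since k-means initialization and tie handling can differ between implementations.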