Question about ‘Using Ethereum blockchain to store and query pharmacogenomics data via smart contracts’

Q:
I’ve read your paper, Using Ethereum blockchain to store and query pharmacogenomics data via smart contracts, published in BMC Medical Genomics, and I’m very interested in your program and results. Could you share your program with me? I’m new to the blockchain field and an independent researcher, so I hope you can share the program and give me some advice.

A:
The GitHub link is in the paper, and you can also find it below:

https://github.com/gersteinlab/idash19bc

It’s freely available to everyone, so you can share it with anyone you like.

Peak-call table?

Q:
I enjoyed reading your recent STARRPeaker paper. However, when I tried to download Additional file 2: Supplementary Table S1 to get the peak calls, it returned the same PDF of the figure supplement as Additional file 1. I’d be grateful if you or a colleague would send me the table or point to where I might download it.

A:
Please see the Supplementary Table 1 file. We are working with Genome Biology to get it corrected. FYI, you can also download BED format peaks (same as Suppl. table S1) from the ENCODE project website (http://bit.ly/whg-starr-seq).

STARRPeaker publication in Genome Biology — missing Supplemental Table 1

Q:
I recently read your STARRPeaker publication in Genome Biology. The STARR-seq technology is very interesting to me, and I thoroughly enjoyed your paper. In considering my own experiments, I was looking through the supplemental data and noticed that the link for Supplemental Table 1 leads to a copy of the PDF of the supplemental figures (S1-S13; it’s a new file but contains the same 13 supplemental figures). Thus, Supplemental Table 1 is completely missing from the Genome Biology site. Could you or one of your co-authors forward a copy of that table to me?

A:
We will contact Genome Biology to get it corrected.

The real cost of sequencing: higher than you think!

Q:
In support of an editorial comment I’m working on with a colleague, I just read your paper: The real cost of sequencing: higher than you think!

It is right on the money as far as my topic of interest goes but is, obviously, dated (though it was helpful to me nonetheless). PubMed didn’t bring up a more recent revision of your paper, so I assume it hasn’t been updated.

Can you recommend any other, more recent papers or resources that address the issue of cost/price of sequencing? The NHGRI webpage on this topic is also helpful, but it was written in 2016 and does not address post-sequencing costs such as variant calling, annotation, and interpretation. Do you think your illustration of the proportion of costs in 2020 due to the different components of testing still holds true?

A:
See:
http://papers.gersteinlab.org/papers/costseq2
http://papers.gersteinlab.org/papers/dsg

Query Regarding GRAM eQTL & MTC

Q:
I am contacting you as the corresponding author for the paper: "GRAM: A generalized model to predict the molecular effect of a non-coding variant in a cell-type specific manner." PLoS genetics 15.8 (2019): e1007860.

I would like to express my thanks to you and your group for developing & publishing GRAM. I have recently tested it out and the results have been most interesting.

I have begun to work with eQTL analysis only recently and as a result, I was wondering what you would recommend as a multiple testing correction method for GRAM score based eQTL analysis?

From the literature I have seen that standard multiple testing correction methods such as Bonferroni & Benjamini-Hochberg have been considered too conservative for regular eQTL analysis, as they do not take linkage disequilibrium into account, and several permutation-testing-based approaches have been published specifically for eQTL as a result (e.g. eigenMT). However, as you have demonstrated that GRAM-score-based eQTL analysis can differentiate the regulatory effects of variants in linkage disequilibrium, I am unsure whether such methods would be appropriate here.

A:
One application scenario for GRAM is fine-mapping, which assumes that you already have a list of eQTLs and the variants in LD with them. If you don’t have eQTLs and want to try GRAM for eQTL identification, one option is to compare the GRAM score with a normally distributed background (built from tens of thousands of randomly selected background variants), infer a p-value for each variant’s GRAM score relative to that background, and then use BH or another FDR method for the multiple-testing correction.

Frankly, this is a great direction for extending GRAM, and we may test it ourselves soon. The most computationally intensive part is calculating the DeepBind scores for the background variants, which will take a long time if we want to test millions of them. If you have any feedback, further questions, or preliminary findings on this, please feel free to let us know.
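
As a rough illustration of the background comparison suggested above, here is a minimal Python sketch. It is not part of GRAM: the function names are hypothetical, the scores are simulated stand-ins for real GRAM scores, and the two-sided test is an assumption (a one-sided test may suit your question better). It fits a normal distribution to the background scores, converts each candidate score to a p-value, and applies the Benjamini-Hochberg adjustment.

```python
import random
from statistics import NormalDist, mean, stdev


def background_p_values(scores, background_scores):
    """Two-sided p-values of scores against a normal fit to the background.

    Assumption: the background scores are roughly normally distributed.
    """
    null = NormalDist(mean(background_scores), stdev(background_scores))
    p_values = []
    for s in scores:
        lower = null.cdf(s)
        upper = 1.0 - lower
        p_values.append(min(1.0, 2.0 * min(lower, upper)))
    return p_values


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    random.seed(0)
    # Simulated placeholders; replace with real background and candidate scores.
    background = [random.gauss(0.0, 1.0) for _ in range(20000)]
    candidates = [0.1, -0.3, 4.5, 5.2]
    p = background_p_values(candidates, background)
    q = benjamini_hochberg(p)
    for score, p_i, q_i in zip(candidates, p, q):
        print(f"score={score:+.2f}  p={p_i:.3g}  BH-adjusted={q_i:.3g}")
```

Variants whose adjusted value falls below your chosen FDR threshold (for example 0.05) would be the candidates to follow up. If the background scores are clearly not normal, an empirical rank-based p-value may be a safer choice than the normal fit.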

Small question of the paper “Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences”

Q:
Recently, I read a paper published in Cell titled "Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences". Because my research topic is similar to this paper’s, I have one question about Figure 2B. In this heatmap, I see a total of 80 motifs along the bottom but only 70 rows above them, so I am a little confused: how do you know that the ETS motif matches the marked row?

A:
The rows in the figure correspond to different cancer cohorts or meta-cohorts. We also list the cancer cohorts with significant differential burdening in Supplement 1 of the paper.

PCAWG passenger mutation analysis

Q1:
I was trying to download a subset of data from your recent paper (https://www.cell.com/cell/fulltext/S0092-8674(20)30113-6). However, the website (http://pcawg.gersteinlab.org/) is returning a ‘not found’ error. I am especially interested in ‘Gene list categories’. Could you please share the files listed under ‘Gene List Categories’ on the website so that I can use them in my analysis?

A1:
The website works fine for me. Are you sure it doesn’t work? Please let me know which specific file you are trying to download.

Q2:
Thanks a lot for the reply.

I need the gene list categories listed under PCAWG-specific annotations (http://pcawg.gersteinlab.org/#Annotations)

Essential Genes
Immune Response Genes
DNA Repair Genes
Metabolic Genes
Cancer Pathway Genes
Non-Essential Genes
Cell Cycle Genes

For some reason, when I click on a link, it directly downloads an HTML file containing an error. It would be great if you could share these files.

A2:
You can download relevant files from the link listed below.

http://pcawg.gersteinlab.org/Datasets/Annotations/categories/

Referring to your paper: Structuring supplemental materials in support of reproducibility

Q:
I just read your paper mentioned above. I work in the area of computational reproducibility so the paper was pretty interesting to read. However, I stumbled a bit over one of your concluding remarks. You are saying

"One useful tactic may be detailed sampling: perhaps it is best for the editor to organize a system wherein, randomly, referees are asked to review samples in greater detail to ensure the overall quality of the supplements without quickly overwhelming the peer review system."

I am not sure whether I understood correctly how this could be implemented. Does it mean that the editor randomly asks one of the reviewers to look at the supplements, or do all reviewers look at subsets of supplements? I find this idea pretty interesting and was wondering whether you have published further articles on this topic?

A:
With respect to: "Does it mean that the editor randomly asks one of the reviewers to look at the supplements, or do all reviewers look at subsets of supplements?"
-> The former.

With respect to: "I find this idea pretty interesting and was wondering whether you have published further articles on this topic?"
-> Not exactly, but you might find this related work useful:
http://papers.gersteinlab.org/papers/structbl
http://papers.gersteinlab.org/papers/SDA