MorphServer job

Q:
I am trying to use Morph Server; my job ID is b198308-29491.
It has been running for 5 days and has still not completed. Could you please check?

A:
The files in your job's folder seem to indicate that it
finished successfully. You can find the files at

http://www.molmovdb.org/uploads/JobID

So yours would be in

http://www.molmovdb.org/uploads/b198308-29491/
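
For anyone looking up several jobs, the URL pattern above can be sketched in a few lines of Python. This is only an illustration of the pattern quoted in this answer; the function name job_results_url is hypothetical and is not part of Morph Server.

```python
from urllib.parse import quote

# Base path quoted in the answer above.
BASE_URL = "http://www.molmovdb.org/uploads"


def job_results_url(job_id: str) -> str:
    """Return the results-folder URL for a Morph Server job ID."""
    job_id = job_id.strip()
    if not job_id:
        raise ValueError("job ID must not be empty")
    # Percent-encode the ID so unexpected characters cannot alter the path.
    return f"{BASE_URL}/{quote(job_id, safe='')}/"


print(job_results_url("b198308-29491"))
```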

Unfortunately, some of the other features of the web interface need
maintenance.

HiC-spector space complexity

Q:
I am currently running HiC-spector on mouse genome datasets with bin size 5kb. I noticed that it requires quite a lot of memory, so I was wondering if there were tests done on HiC-spector’s space complexity, as I couldn’t find such studies in the Supplementary Data.

A:
We did not analyze space complexity explicitly. Because the contact maps are stored as sparse matrices, memory use does not grow quadratically with the number of bins. In general, if the calculation is done chromosome by chromosome, a 5kb bin size should be fine.
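
A rough back-of-the-envelope comparison shows why sparse storage matters at this resolution. The sketch below is not taken from HiC-spector: the chromosome length (about 195 Mb, roughly the size of the largest mouse chromosome), the count of non-zero contacts, and the byte sizes are all assumptions chosen for illustration, and the sparse estimate assumes a simple coordinate layout with one value and two indices per stored entry.

```python
def n_bins(chrom_length_bp: int, bin_size_bp: int) -> int:
    """Number of bins needed to cover a chromosome (ceiling division)."""
    return -(-chrom_length_bp // bin_size_bp)


def dense_bytes(bins: int, bytes_per_value: int = 8) -> int:
    """Memory for a full bins x bins matrix: grows quadratically with bins."""
    return bins * bins * bytes_per_value


def sparse_bytes(n_nonzero: int, bytes_per_value: int = 8,
                 bytes_per_index: int = 4) -> int:
    """Memory for coordinate-style storage: one value plus a row and a
    column index per non-zero entry, so it grows with the contact count."""
    return n_nonzero * (bytes_per_value + 2 * bytes_per_index)


# Assumed inputs, for illustration only.
chrom_length = 195_000_000   # approximate length of one large chromosome
bin_size = 5_000             # 5kb bins, as in the question
nonzero_contacts = 50_000_000  # hypothetical number of non-zero bin pairs

bins = n_bins(chrom_length, bin_size)
print(f"bins:   {bins}")
print(f"dense:  {dense_bytes(bins) / 1e9:.2f} GB")
print(f"sparse: {sparse_bytes(nonzero_contacts) / 1e9:.2f} GB")
```

With these assumed numbers, a dense matrix for a single chromosome would need about 12 GB, while the sparse representation needs under 1 GB. The actual footprint depends on sequencing depth, since that determines how many bin pairs are non-zero.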

PNAS paper supplement duplication

Q1:
I am reading with interest your recent paper (Kumar, Clarke, and Gerstein, PNAS), but I suspect that supplement 1 and 2 are the same, and neither has a list of 434 genes. Could you please supply the list?

A1:
Thank you very much for your interest in the paper. Supplement 1 includes hotspot communities based on the pan-cancer analysis (i.e., when we compute statistics over multiple cancer cohorts in TCGA). In contrast, supplement 2 lists putative driver genes with hotspot communities for specific cancer types. Note that in supplement 2, column F lists the name of the particular cancer cohort.

Regarding the number of genes, the 434 genes are based on the pan-cancer analysis.
For each gene, there are multiple PDB entries. For the analysis in our paper, we selected a representative structure with the highest residue coverage. However, to be exhaustive and to allow researchers to analyze proteins of interest to them, our supplement includes all PDB entries for a given gene. We have tried to explain this in our methods section.
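
For readers working from the full supplement, choosing one structure per gene can be sketched as below. This is a minimal illustration, not the code used in the paper: it assumes coverage is available as a single number per entry (for example, the fraction of the protein's residues resolved in the structure), and the entry names are placeholders rather than real PDB IDs.

```python
from typing import List, Tuple

# (entry identifier, residue coverage); coverage is an assumed numeric score.
Entry = Tuple[str, float]


def pick_representative(entries: List[Entry]) -> Entry:
    """Return the entry with the highest residue coverage.

    If several entries tie, the first one in the list is returned.
    """
    if not entries:
        raise ValueError("no entries for this gene")
    return max(entries, key=lambda entry: entry[1])


# Placeholder entries for a single hypothetical gene.
candidates = [("entry_a", 0.42), ("entry_b", 0.87), ("entry_c", 0.61)]
print(pick_representative(candidates))
```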

Q2:
Thanks for your quick reply; but, no, this does not remove my confusion. Please take a moment to check the link from your paper at PNAS. When I download pnas.1901156116.sd01.xlsx, the file has 217 lines (not 434) and includes the column F that breaks down by cancer type.

A2:
I am attaching our original tables to this email. It appears that the table has somehow been duplicated on the PNAS website. We will work with the PNAS team to get it fixed.

Supplemental_tables.xlsx

Supplementary data of Architecture of the human regulatory network derived from ENCODE data

Q:
I recently read the ENCODE paper "Architecture of the human regulatory network derived from ENCODE data", and I realized that the supplementary data would greatly help me refine my project's results, in particular the files related to K562. Unfortunately, I found that none of the supplementary data files are available to download, since neither of the following sites can be reached.

http://encodenets.gersteinlab.org
http://archive.gersteinlab.org/proj/encodenetsold

In particular, the second link is active, but if I try to download one of the files, it points to the first link and the download is interrupted. I am writing to ask if there are any other ways to access the files.

A:
http://encodenets.gersteinlab.org should be back up now. Let us know

Funseq2 Web Server

Q:
The Funseq2 Web Server has been down recently. Will it be available in the next few days?

A:
The Funseq2 web server is up and running now. There has been some suspicious activity on the server recently, and we are continuing to monitor it.
If you are submitting your own query, please use the correct format, or the server will show a ‘service unavailable’ message.

As an alternative, you can download the whole-genome annotations for both hg19 and hg38 from funseq3.gersteinlab.org and then query them with tabix.
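
A local tabix query can be wrapped in a short Python helper, sketched below. It assumes the tabix command-line tool is installed and that the downloaded annotation file is bgzip-compressed with its .tbi index alongside it. The file name annotations.hg19.bed.gz is a placeholder, not the actual name of the file on the download site, and the function names are hypothetical.

```python
import subprocess
from typing import List


def tabix_command(annotation_path: str, chrom: str,
                  start: int, end: int) -> List[str]:
    """Build the tabix call for one region (1-based, inclusive coordinates)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return ["tabix", annotation_path, f"{chrom}:{start}-{end}"]


def query_region(annotation_path: str, chrom: str,
                 start: int, end: int) -> List[str]:
    """Run tabix and return the matching annotation lines."""
    result = subprocess.run(
        tabix_command(annotation_path, chrom, start, end),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


# Show the command that would be run; the file name is a placeholder.
print(" ".join(tabix_command("annotations.hg19.bed.gz", "chr1",
                             1_000_000, 1_001_000)))
```

Calling query_region with the path to the real downloaded file returns one string per overlapping annotation record. The chromosome naming (for example chr1 versus 1) must match what is used in the file.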