Q:
I am developing a pipeline to analyze the Hi-C data from the PsychENCODE project. As a sanity check, I want to map the enhancer-transcription start site (TSS) pairs from the file http://resource.psychencode.org/Datasets/Integrative/INT-16_HiC_EP_linkages.csv to the TADs inferred by the PsychENCODE project in the file http://resource.psychencode.org/Datasets/Derived/DER-18_TAD_adultbrain.bed.
Looking at the enhancer and TSS, the TSS have very "round" coordinates (e.g. 90000, 630000, etc). Just to confirm, those are still genomic coordinates, right?
Also, are the coordinates of the TADs genomic coordinates, or Hi-C bins? I assumed that was the case, but could not find any of the enhancer-TSS pairs in the same TAD, which is what I expected.
A:
RE your questions:
Looking at the enhancer and TSS, the TSS have very "round" coordinates (e.g. 90000, 630000, etc). Just to confirm, those are still genomic coordinates, right?
-> Yes. They are genomic coordinates, but they are reported at the Hi-C resolution (10 kb), not as the actual TSS positions. So you can simply overlap the TSS coordinates with the actual promoter coordinates to link genes to enhancers.
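The overlap described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it assumes the reported TSS coordinate is the start of a 10 kb bin (the answer does not say whether it is the start or the end), and the `genes_in_bin` helper, the promoter tuples, and the gene names are all hypothetical.

```python
BIN_SIZE = 10_000  # 10 kb Hi-C resolution, as stated in the answer


def genes_in_bin(chrom, bin_coord, promoters, bin_size=BIN_SIZE):
    """Return the genes whose promoter overlaps one Hi-C bin.

    Assumption: bin_coord is the START of the bin, so the bin covers
    [bin_coord, bin_coord + bin_size). Adjust if the file uses bin ends.
    promoters is a list of (chrom, start, end, gene) half-open intervals.
    """
    bin_start, bin_end = bin_coord, bin_coord + bin_size
    return sorted(
        {
            gene
            for p_chrom, p_start, p_end, gene in promoters
            # Two half-open intervals overlap when each starts before the other ends.
            if p_chrom == chrom and p_start < bin_end and p_end > bin_start
        }
    )


# Hypothetical promoter annotation, for illustration only.
promoters = [
    ("chr1", 91_500, 93_500, "GENE_A"),
    ("chr1", 99_000, 101_000, "GENE_B"),
    ("chr1", 630_200, 632_200, "GENE_C"),
    ("chr2", 91_500, 93_500, "GENE_D"),
]

print(genes_in_bin("chr1", 90_000, promoters))  # ['GENE_A', 'GENE_B']
```

Note that one bin can overlap several promoters, so a single "round" TSS coordinate may link an enhancer to more than one gene.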
Also, are the coordinates of the TADs genomic coordinates, or Hi-C bins? I assumed that was the case, but could not find any of the enhancer-TSS pairs in the same TAD, which is what I expected.
-> TAD coordinates should also be genomic coordinates, not Hi-C bins. It’s odd that you didn’t find enhancer-TSS pairs in the same TAD, because we found that >70% of enhancer-promoter (E-P) links are located within TADs.
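A quick way to check the within-TAD fraction on your side could look like the sketch below. It is not the method used to obtain the >70% figure; the tuple layouts, the helper names, and the toy data are hypothetical, and the enhancer is represented by its midpoint as a simplifying assumption. The chromosome-name normalization is included only because a "chr1" versus "1" mismatch is cheap to rule out, not because it is known to be the cause here.

```python
def norm_chrom(name):
    """Normalize chromosome names so that '1' and 'chr1' compare equal."""
    name = str(name).strip()
    return name if name.startswith("chr") else "chr" + name


def find_tad(chrom, pos, tads):
    """Return the index of the TAD containing pos, or None.

    tads is a list of (chrom, start, end) half-open intervals.
    """
    chrom = norm_chrom(chrom)
    for i, (t_chrom, t_start, t_end) in enumerate(tads):
        if norm_chrom(t_chrom) == chrom and t_start <= pos < t_end:
            return i
    return None


def fraction_within_tad(pairs, tads):
    """Fraction of pairs whose enhancer midpoint and TSS share one TAD.

    pairs is a list of (chrom, enhancer_start, enhancer_end, tss_coord).
    """
    if not pairs:
        return 0.0
    hits = 0
    for chrom, enh_start, enh_end, tss in pairs:
        enh_mid = (enh_start + enh_end) // 2
        tad_enh = find_tad(chrom, enh_mid, tads)
        tad_tss = find_tad(chrom, tss, tads)
        if tad_enh is not None and tad_enh == tad_tss:
            hits += 1
    return hits / len(pairs)


# Hypothetical toy data, for illustration only.
tads = [("chr1", 0, 500_000), ("chr1", 500_000, 1_200_000)]
pairs = [
    ("1", 100_000, 100_500, 90_000),      # same TAD, despite the bare "1"
    ("chr1", 450_000, 451_000, 630_000),  # enhancer and TSS in different TADs
    ("chr1", 700_000, 700_400, 630_000),  # same TAD
    ("chr2", 100_000, 100_400, 90_000),   # no TAD on this chromosome
]

print(fraction_within_tad(pairs, tads))  # 0.5
```

If a check like this returns exactly zero on the real files, the problem is more likely in parsing (column order, chromosome naming, header rows) than in the biology.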