Q:
I downloaded the psiDR data to compare the results with the lincRNAs
previously posted on the UCSC web site.
I found that the following entry doesn’t match the current hg19
position assigned in the UCSC genome browser:
gene_id "ENSG00000224184.1"; transcript_id "ENSG00000224184.1"; gene_type
"lincRNA"; gene_status "NOVEL"; gene_name "AC096559.1"; transcript_type
"lincRNA"; transcript_status "NOVEL"; transcript_name "AC096559.1"; level 2;
tag "ncRNA_host"; havana_gene "OTTHUMG00000151709.2";
Coordinates at psiDR are: chr2:11,988,748-12,718,474
Coordinates at UCSC are: chr2:12,716,164-12,783,038
I don’t know whether this also happens with the coordinates of other
elements.
I can’t find a way to explain this difference other than a mistake in the
annotation process, but maybe I’m wrong and there is a better explanation.
A:
We use the GENCODE gene annotation model. If you check Ensembl for "ENSG00000224184.1", you will see that it matches the coordinates at psiDR.
I think the UCSC track includes the actual clone boundaries. You can e-mail the UCSC help desk; they are generally very responsive. Please bear in mind that coordinates also change a bit with updated genome assemblies as well as refined gene annotation models.
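If you want to check how far the two sets of coordinates actually differ, a small script can help. The following is only a minimal sketch in Python, not part of psiDR or any UCSC tool: the function names `parse_region` and `overlap_length` are made up for illustration, and it assumes both positions are 1-based, inclusive ranges on the same assembly (hg19).

```python
import re


def parse_region(region):
    """Parse a browser-style position such as 'chr2:11,988,748-12,718,474'.

    Returns (chrom, start, end) with the thousands separators removed.
    Assumption: coordinates are 1-based and inclusive.
    """
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", region.strip())
    if match is None:
        raise ValueError(f"Unrecognized region string: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"Start is greater than end in {region!r}")
    return chrom, start, end


def overlap_length(region_a, region_b):
    """Return the number of bases shared by two regions (0 if none)."""
    chrom_a, start_a, end_a = parse_region(region_a)
    chrom_b, start_b, end_b = parse_region(region_b)
    if chrom_a != chrom_b:
        return 0
    # +1 because both ends are treated as inclusive.
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


# The two positions quoted in the question.
psidr = "chr2:11,988,748-12,718,474"
ucsc = "chr2:12,716,164-12,783,038"

print(overlap_length(psidr, ucsc))  # 2311
```

Under these assumptions the two ranges share only 2,311 bases at the end of the psiDR range, so the difference is more than a small boundary shift.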