Human pseudogene annotation

Q:
In your recent Nature Communications report on mouse pseudogenes (https://doi.org/10.1038/s41467-020-17157-w), you stated that “For human, we used a similar workflow to refine the reference pseudogene annotation to a high-quality set of 14,650 pseudogenes.” Could you kindly share the chromosome coordinates of these 14,650 pseudogenes with me? I am investigating the distribution patterns of RNA-editing sites in the human genome, and I cannot find a good source for pseudogene definitions. The Pseudogene.org database is too old and is not based on GRCh38.

A:
In the paper, we worked with the GENCODE consortium to refine the pseudogene annotation. Since the paper's publication, we have continued to improve the human pseudogene annotation using a combination of manual and automatic pipelines, as described in the paper. Attached are the coordinates for the complete set of pseudogenes.

For a definition of a pseudogene, I suggest you use our paper https://genomebiology.biomedcentral.com/articles/10.1186/gb-2012-13-9-r51, which defines pseudogenes as defunct genomic loci with sequence similarity to functional genes but lacking coding potential due to the presence of disruptive mutations such as frameshifts and premature stop codons.
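For the use case in the question, checking which pseudogenes an RNA-editing site falls into could be sketched roughly as below. This is only an illustration: it assumes the coordinates have been loaded as BED-like tuples (chromosome, 0-based start, exclusive end, name), which may not match the layout of the attached file, and the function names and the example intervals are hypothetical, not real annotation data.

```python
from collections import defaultdict


def build_index(intervals):
    """Group (chrom, start, end, name) tuples by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end, name in intervals:
        index[chrom].append((start, end, name))
    for chrom in index:
        index[chrom].sort()
    return index


def pseudogenes_at(index, chrom, pos):
    """Return names of pseudogenes whose half-open interval [start, end) contains pos."""
    hits = []
    for start, end, name in index.get(chrom, []):
        if start > pos:
            # Intervals are sorted by start, so no later interval can contain pos.
            break
        if pos < end:
            hits.append(name)
    return hits


# Made-up intervals for illustration only (assumed 0-based, half-open).
example = [
    ("chr1", 100, 200, "PG_A"),
    ("chr1", 150, 300, "PG_B"),
    ("chr2", 50, 80, "PG_C"),
]
index = build_index(example)
print(pseudogenes_at(index, "chr1", 160))
```

The linear scan per chromosome is fine for a quick check; for millions of sites against the full annotation, an interval tree or a dedicated tool such as bedtools would likely be a better fit.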
