Q1:
I am interested in using the information in your database to design PCR probes that would recognize several functional genes and their associated pseudogenes.
Do I need to obtain any type of written permission to use this information?
A1:
Nope!
Q2:
I checked one gene with 9 pseudogenes listed and tried to align the
sequences to make PCR primers that detect all 10 copies. However, I realized
that being a bit naïve about pseudogenes led me down the wrong path: I
thought the sequences would be more similar, and suited to being used to
estimate copy number for inserting foreign genes. While I did get regions
that hit 3-6 of the 10 genes, it wasn’t consistent enough.
I was wondering if you have data on % conservation, or any type of
algorithm that would predict the % conservation of pseudogene to gene and
pull out the names/gene IDs and number of pseudogenes?
A2:
It would be helpful if you could tell us a bit more about what you are trying to do.
I assume you are looking at human pseudogenes. We do have percent identity between the parent protein and the pseudogene.
Q3:
I’m trying to figure out a sensible way to use the number of pseudogenes per gene as a natural standard curve for real-time PCR. See the attached Excel file. I chose at random genes with 9 to 1 listed pseudogenes, which theoretically would allow me to target endogenous genes of different copy number and get some type of standard curve. This assumes equal amplification efficiency, etc.
I didn’t pay attention to the column "Identity" but now I’m thinking I can sort out genes based on high identity and try again?
A3:
I think that identity should be taken into account when you are creating the standard curve. Also, note that in the Excel file there is a column named fraction (after gene ID), which indicates the fraction of a parent gene aligned to its pseudogene. The start and end coordinates of each alignment are also in the Excel file (the columns between protein ID and gene ID). You may want to take these into consideration too.
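As a rough illustration of that suggestion, here is a minimal Python sketch that keeps only pseudogenes passing both an identity cutoff and a fraction cutoff, then groups them by parent gene. It assumes the spreadsheet has been exported to CSV. The column names (gene_id, pseudogene_id, identity, fraction), the 0-1 scale for both values, the 0.9 cutoffs, and the example rows are all made up for illustration; they are not taken from the actual file, so adjust them to match your export.

```python
import csv
import io
from collections import defaultdict


def select_candidates(rows, min_identity=0.9, min_fraction=0.9):
    """Group pseudogenes by parent gene, keeping those that pass both cutoffs.

    Column names and the 0-1 scale are assumptions, not the real file layout.
    """
    kept = defaultdict(list)
    for row in rows:
        identity = float(row["identity"])
        fraction = float(row["fraction"])
        if identity >= min_identity and fraction >= min_fraction:
            kept[row["gene_id"]].append(row["pseudogene_id"])
    return dict(kept)


# Made-up rows, purely to show the filtering behaviour.
EXAMPLE_CSV = """gene_id,pseudogene_id,identity,fraction
GENE_A,PG_A1,0.97,0.95
GENE_A,PG_A2,0.62,0.99
GENE_A,PG_A3,0.95,0.40
GENE_B,PG_B1,0.99,1.00
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
candidates = select_candidates(rows)

for gene_id, pseudogenes in sorted(candidates.items()):
    # Parent gene plus the pseudogenes that passed both filters.
    print(gene_id, 1 + len(pseudogenes), pseudogenes)
```

Note that the identity reported is between the parent protein and the pseudogene, so a high value does not guarantee that a given primer pair will match at the nucleotide level. The alignment start and end coordinates would still need to be checked against the intended amplicon.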