We are looking for pseudogenes of the human proteins P53 (tumor protein 53, a tumor suppressor) and WSTF (also known as BAZ1B). There is no information about them in Pseudogene.org. Could you please help us find a way to get this result?
Later on I found a web service called PseudoGeneQuest. I submitted my target protein sequences and got the results shown in the following forwarded emails.
The results showed that there are known pseudogenes in your database; however, I couldn't extract the data. Could you please help me do so?
I have looked at our pseudogene database and there are no pseudogenes for P53 or WSTF. I rechecked this by redoing the homology analysis against the genome with both the P53 and WSTF sequences, and no other regions of the genome are good hits to either. I have also looked at the results from the other program: the matches are either to coding exons of other genes or are not significant, i.e. the match lengths are very small and the e-values are not significant.
For example, these are the other regions in the genome that BLAST finds to be homologous to the coding sequence. Please see the attached image. The only significant matches to the P53 protein are:
1. This corresponds to P53 itself
2. NT_004350.19 This corresponds to P73, another gene and not a pseudogene
3. NT_005612.16 This corresponds to P63, another gene and not a pseudogene
The other two matches are not significant: each aligns to only 20% of the length of P53.
This is the result that you obtained from the other program.
0 - QUERY:111222153038348410812
2 - KNOWN_PSEUDOGENE:ref|NT_004350.19|:NT_010755.15:3118600:3119076
2 - KNOWN_PSEUDOGENE:ref|NT_004350.19|:NT_033903.7:3114083:3118495
2 - KNOWN_PSEUDOGENE:ref|NT_010718.16|:NT_008470.18:7177265:7178188
2 - KNOWN_PSEUDOGENE:ref|NT_010718.16|:NT_023935.17:7181340:7182403
2 - KNOWN_PSEUDOGENE:ref|NT_010718.16|:NT_079573.3:7181224:7182633
3 - REAL GENE OR EXON:ref|NT_004350.19|:3122278:3122442
3 - REAL GENE OR EXON:ref|NT_005612.16|:96077137:96077361
3 - REAL GENE OR EXON:ref|NT_005612.16|:96079592:96079771
3 - REAL GENE OR EXON:ref|NT_005612.16|:96080735:96080899
3 - REAL GENE OR EXON:ref|NT_005612.16|:96081483:96081638
3 - REAL GENE OR EXON:ref|NT_010718.16|:7176274:7176414
3 - REAL GENE OR EXON:ref|NT_010718.16|:7180194:7180331
3 - REAL GENE OR EXON:ref|NT_010718.16|:7180364:7180564
3 - REAL GENE OR EXON:ref|NT_010718.16|:7180845:7181012
3 - REAL GENE OR EXON:ref|NT_010718.16|:7183182:7183316
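If it helps to pull the hits out of a report like the one above, here is a minimal Python sketch that groups them by label. The field layout (a numeric code, a label, a "ref|contig|" field, an optional extra field, then start and end coordinates) is only inferred from the lines shown here, not taken from any PseudoGeneQuest documentation, and the function name parse_report is made up for this illustration.

```python
from collections import defaultdict

# A few lines copied from the report above. The layout is inferred from
# these lines only; it is an assumption, not a documented format.
REPORT = """\
0 - QUERY:111222153038348410812
2 - KNOWN_PSEUDOGENE:ref|NT_004350.19|:NT_010755.15:3118600:3119076
2 - KNOWN_PSEUDOGENE:ref|NT_010718.16|:NT_008470.18:7177265:7178188
3 - REAL GENE OR EXON:ref|NT_004350.19|:3122278:3122442
3 - REAL GENE OR EXON:ref|NT_005612.16|:96077137:96077361
"""


def parse_report(text):
    """Group hits by label; each hit is a (contig, start, end) tuple."""
    hits = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if " - " not in line:
            continue  # skip blank or unrecognised lines
        _code, rest = line.split(" - ", 1)
        fields = rest.split(":")
        label = fields[0]
        if label == "QUERY":
            continue  # the query line carries an ID, not coordinates
        # Assumed: second field is "ref|<contig>|", last two are start and end.
        contig = fields[1].split("|")[1]
        start, end = int(fields[-2]), int(fields[-1])
        hits[label].append((contig, start, end))
    return dict(hits)


if __name__ == "__main__":
    for label, entries in parse_report(REPORT).items():
        print(label, len(entries))
        for contig, start, end in entries:
            print("   ", contig, start, end, "length", end - start + 1)
```

Printing the span length next to each hit makes it easier to see how short the matches are. The labels are those assigned by the program and say nothing about whether a match is significant.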
So all the good hits are to coding exons of P53, P63, or P73, presumably because P53 is homologous to P63, P73, etc.
Similarly for WSTF, the other matches are either to known genes or not significant. You can easily check this by querying your protein sequence with BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&PROG_DEF=blastn&BLAST_PROG_DEF=megaBlast&SHOW_DEFAULTS=on&SHOW_DEFAULTS=on&BLAST_SPEC=OGP__9606__9558)
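If you save the BLAST hits in tabular form, a short Python sketch like the one below can apply the two checks mentioned above: e-value and the fraction of the query that a match covers. It assumes the standard 12-column tabular layout of command-line BLAST+ (-outfmt 6). The cutoffs (1e-5 and 50% coverage) and the two example rows are invented for illustration and are not the values used in our analysis. The query length of 393 is the length of human P53 in amino acids.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")

# Invented rows, for illustration only (not real BLAST results).
EXAMPLE_ROWS = [
    "query\tcontigA\t100.0\t393\t0\t0\t1\t393\t1000\t2178\t0.0\t800",
    "query\tcontigB\t45.0\t80\t44\t0\t100\t179\t5000\t5239\t2.1\t30.5",
]


def parse_hit(line):
    """Turn one tab-separated row into a dict with numeric fields converted."""
    hit = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    for key in INT_COLUMNS:
        hit[key] = int(hit[key])
    for key in FLOAT_COLUMNS:
        hit[key] = float(hit[key])
    return hit


def query_coverage(hit, query_len):
    """Fraction of the query covered by this single alignment."""
    return (hit["qend"] - hit["qstart"] + 1) / query_len


def significant_hits(lines, query_len, max_evalue=1e-5, min_coverage=0.5):
    """Keep hits passing both cutoffs; the default cutoffs are assumptions."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = parse_hit(line)
        if hit["evalue"] <= max_evalue and query_coverage(hit, query_len) >= min_coverage:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    for hit in significant_hits(EXAMPLE_ROWS, query_len=393):
        print(hit["sseqid"], hit["evalue"], round(query_coverage(hit, 393), 2))
```

Coverage here is computed per alignment, which is a simplification: a multi-exon gene aligned against genomic sequence is usually split across several alignments, so a real gene can show low per-alignment coverage and should be judged across all of its exons.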