Q:
I am using PseudoPipe to find pseudogenes from a query Chromosome. I have a chromosome nucleotide sequence file and a protein sequences file.
I am not getting what is MySQL file and how can get this and one more file of masking.
A:
PseudoPipe is configured to run on nucleotide and protein sequence files as formatted and available for download from the ensembl server.
Regarding your issues:
1. A MySQL file is a file dowloaded from a MySQL database , and thus has it’s specific format. Ensemble uses this database to store exons co-ordinates for all the protein coding genes starting with an exon id, chromosome number, start and end position, strand, etc . As such I suggest you format your exons information accordingly . As example you can use the” chrI_exLocs” file located in the mysql folder from the C.elegans example that you downloaded along with pseudopipe.
2. A masking file is a nucleotide files (in fasta format) that masks all the repeat sequences from the genome. If you want to create it yourself you should use a repeat masker and format it accordingly to the file that you see in the dna folder in the C.elegans example dna_rm.fa .