I am using PseudoPipe to find pseudogenes in a query chromosome. I have a chromosome nucleotide sequence file and a protein sequence file.
I don't understand what the MySQL file is or how to obtain it, and I have the same question about the masking file.
PseudoPipe is configured to run on nucleotide and protein sequence files in the format in which they are available for download from the Ensembl server.
Regarding your issues:
1. A MySQL file is a file downloaded from a MySQL database, and thus has its own specific format. Ensembl uses this database to store exon coordinates for all the protein-coding genes; each record starts with an exon ID, followed by chromosome number, start and end position, strand, etc. I therefore suggest you format your exon information accordingly. As an example, you can use the "chrI_exLocs" file located in the mysql folder of the C. elegans example that you downloaded along with PseudoPipe.
2. A masking file is a nucleotide file (in FASTA format) in which all the repeat sequences of the genome are masked. If you want to create it yourself, you should use a repeat masker and format the output to match the dna_rm.fa file in the dna folder of the C. elegans example.
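
If it helps, here is a rough Python sketch of both preparation steps. It is only an illustration under several assumptions, not PseudoPipe's own code: it assumes your exon annotation is an Ensembl-style GTF file, that the output columns should follow the order described in point 1 (exon ID, chromosome, start, end, strand), and that your repeat masker produced soft-masked (lowercase) sequence that needs converting to N. The function names `gtf_exons_to_table` and `hard_mask` are made up for this example. Please compare the output with chrI_exLocs and dna_rm.fa before running the pipeline, since the exact column layout and masking character should be taken from those files.

```python
import re


def gtf_exons_to_table(gtf_lines):
    """Yield tab-separated exon rows from Ensembl-style GTF lines.

    Assumed column order (check against chrI_exLocs):
    exon ID, chromosome, start, end, strand.
    """
    for line in gtf_lines:
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # GTF has 9 columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, start, end, strand = fields[0], fields[3], fields[4], fields[6]
        # The exon ID lives in the attribute column, e.g. exon_id "X";
        match = re.search(r'exon_id "([^"]+)"', fields[8])
        exon_id = match.group(1) if match else "."
        yield "\t".join([exon_id, chrom, start, end, strand])


def hard_mask(sequence):
    """Replace soft-masked (lowercase) bases with N.

    Apply this to sequence lines only, not to FASTA header lines.
    """
    return "".join("N" if base.islower() else base for base in sequence)
```

You would read your GTF line by line into `gtf_exons_to_table` and write the rows to a file, and run `hard_mask` over every non-header line of the soft-masked FASTA. If your repeat masker already writes N for repeats, the second step is unnecessary.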