Request for help with PseudoPipe

Q:
Why do the genomic sequences need to be repeat-masked before they are input to the pipeline?

A:
Repeat masking keeps the low-complexity regions of the genome out of the pseudogene search.
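As a rough illustration of what masking means in practice (this is not PseudoPipe's own code): masked genome files usually mark repeats either in lowercase (soft masking) or as runs of `N` (hard masking). The sketch below uses two hypothetical helpers, `hard_mask` and `masked_fraction`, and assumes the input sequence is soft-masked.

```python
def hard_mask(sequence: str) -> str:
    """Replace every soft-masked (lowercase) base with 'N'.

    Assumes lowercase letters mark repeat or low-complexity regions,
    which is the usual soft-masking convention.
    """
    return "".join("N" if base.islower() else base for base in sequence)


def masked_fraction(sequence: str) -> float:
    """Return the fraction of bases that are masked (lowercase or 'N')."""
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base.islower() or base == "N")
    return masked / len(sequence)


# Toy sequence: the lowercase stretch stands in for a detected repeat.
soft_masked = "ACGTacgtacgtGGCC"
hard_masked = hard_mask(soft_masked)
print(hard_masked)
print(masked_fraction(soft_masked))
```

After hard masking, the repeat stretch carries no sequence information, so it can no longer produce spurious matches in a similarity search.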

Q:
Which database should we use for repeat masking?

A:
Our current pipeline downloads genome data from Ensembl, where the repeats are detected with the RepeatMasker tool. More information about PseudoPipe can be found at: https://faq.gersteinlab.org/category/pseudogenes/.

>
> Dear Prof. Gerstein,
>
> My name is Yiling Lai, a PhD student from Prof. Xingzhong Liu’s group in the Institute of Microbiology, Chinese Academy of Sciences. Our research focuses on comparative genomics of the nematode endoparasitic fungi Hirsutella spp. We have now started to analyse the genomic sequences and are using PseudoPipe, from your published method, to identify pseudogenes in these genomes. However, some questions have come up while using the pipeline. The first is why the genomic sequences need to be repeat-masked before they are input to the pipeline. The second is which database we should use for repeat masking: the Repbase database, or a database built from de novo consensus sequences by RepeatScout? We would greatly appreciate any suggestions. Thank you very much! We look forward to your reply.
>
> Best wishes
>
>
> Yiling Lai
>
>
> State Key Laboratory of Mycology
>
> Institute of Microbiology
>
> Chinese Academy of Sciences
>
> No.3 1st Beichen West Road, Chaoyang District
>
> Beijing 100101, PR China
