Prefix ‘chr’ in liftOver

Q:

From the documentation at http://info.gersteinlab.org/AlleleSeq:

Chain files

Using the chain file, one can use the liftOver tool to convert the annotation coordinates from the reference genome to personal haplotypes.

However, when I tried to liftOver my bed file using maternal.chain, every record was returned as unmapped. An excerpt from the chain file:

249242013 1

10329 1 0

109 1 0

30199 3 0

My bed file:

chr1 14541 14542

chr1 14652 14653

chr1 14676 14677

chr1 14906 14907

A:

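It looks like liftOver failed because the .bed and .chain files use different chromosome naming conventions. In the .bed file, chromosome names carry the prefix ‘chr’ (e.g. chr1), while in the chain files they have no such prefix (e.g. 1). Make the names consistent, for example by removing the ‘chr’ prefix from the .bed file, and run liftOver again.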
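One way to make the names consistent is to strip the prefix from the .bed file before running liftOver. The following is only a minimal sketch in Python, not part of AlleleSeq or liftOver; the function name strip_chr_prefix is hypothetical, and it assumes that the only difference is a leading ‘chr’ (it does not handle other naming differences such as chrM versus MT).

```python
def strip_chr_prefix(bed_lines):
    """Yield BED lines with a leading 'chr' removed from the chromosome column.

    Lines that do not start with 'chr' (e.g. 'track' or 'browser' header
    lines) are passed through unchanged.
    """
    for line in bed_lines:
        if line.startswith("chr"):
            yield line[3:]
        else:
            yield line


# The records from the question, written as tab-separated BED lines.
example = [
    "chr1\t14541\t14542\n",
    "chr1\t14652\t14653\n",
    "chr1\t14676\t14677\n",
    "chr1\t14906\t14907\n",
]

for record in strip_chr_prefix(example):
    print(record, end="")
```

The same generator can be applied to an open file handle, writing the result to a new .bed file that is then passed to liftOver together with maternal.chain.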

Fosmid indels

Q:

I would like to ask what kind of indels are incorporated into the diploid genome assembly of the NA12878 individual, available from your lab:

http://sv.gersteinlab.org/NA12878_diploid/NA12878_diploid_dec16.2012.zip

In the readme it says that 829,454 indels were used to construct this genome. What confuses me is that when I perform a BLAST search with one 1.7 kb deletion from NA12878.2010_06.and.fosmid.deletions.phased.vcf (P2_M_061510_21_73), it shows up in both the maternal and paternal haplotypes. Is there any size cutoff used for the indels that have been selected for this assembly?

A:

Unfortunately, in the latest version, no fosmid indels/SVs were used; only the variant output of GATK Best Practices v3 was used, even though fosmid data was indeed used to construct the earlier versions of the diploid genome. We might include them in the future. Thank you.

Do you need parents’ genotype data?

Q:

I am looking for a tool to detect allele-specific expression from resequencing and RNA-seq data, and AlleleSeq looks quite powerful. I noticed that the software needs the parents' genotype data as input: it requires a VCF file containing trio genotypes to create the maternal and paternal genomes. In my case, I only have genotype information from a single individual. How could I use AlleleSeq?

A:

You don't have to genotype the parents. You only need to have the variants phased, in whatever way you can or wish (the vcf2diploid tool looks only at the one column with information for the individual of interest and does not consider the other columns). Having the trio sequenced is an easy and probably the best way to do it.

If you have only the mother's genotype, you can still phase a good fraction of the heterozygous variants. Each unphased variant will be randomly assigned to a particular haplotype, so about half of them will also end up on the correct haplotype. And, of course, all homozygous variants will be phased.
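The reasoning above can be illustrated with a small Python sketch. This is not the vcf2diploid implementation; the function name phase_with_mother and the tuple representation of genotypes are assumptions made for illustration only. A heterozygous site is phased when exactly one of the child's alleles is present in the mother; otherwise the alleles are assigned at random, as described above.

```python
import random


def phase_with_mother(child, mother, rng=random):
    """Return (maternal_allele, paternal_allele, is_phased) for one site.

    child and mother are pairs of alleles, e.g. ("A", "G").
    Hypothetical helper for illustration; not part of vcf2diploid.
    """
    a, b = child
    if a == b:
        # Homozygous in the child: both haplotypes carry the same allele.
        return a, b, True
    a_in_mother = a in mother
    b_in_mother = b in mother
    if a_in_mother and not b_in_mother:
        return a, b, True
    if b_in_mother and not a_in_mother:
        return b, a, True
    # Ambiguous (the mother carries both alleles, or neither):
    # assign at random, which is right about half of the time.
    if rng.random() < 0.5:
        return a, b, False
    return b, a, False


print(phase_with_mother(("A", "G"), ("G", "G")))  # mother can only give G
print(phase_with_mother(("C", "C"), ("C", "T")))  # homozygous site
```

Sites where the mother is heterozygous for the same two alleles stay ambiguous, which is why a full trio resolves more variants than a single parent.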

Mismatches between the paternal and maternal chromosomes

Q:
I believe I have discovered numerous errors in the NA12878 dataset. We are working with the most recent version, NA12878_diploid_genome_may3_2011. They are all single base pair mismatches between the paternal and maternal chromosomes in regions that the accompanying .map file marks as contigs.

A:
The .map file shows continuous, equivalent (gap-free) blocks between the haplotypes. These blocks do include SNPs, however, so heterozygous SNPs will show up as base mismatches within a block. Such mismatches are expected and are not errors.
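To check this on your own data, you can compare the two haplotype sequences of one block position by position. The sketch below is an illustration only: the function name mismatch_positions is hypothetical, and it assumes you have already extracted the maternal and paternal sequences of a block as two strings of equal length (it does not parse the .map file).

```python
def mismatch_positions(maternal_block, paternal_block):
    """Return 0-based offsets at which two equal-length sequences differ.

    Within a gap-free block the haplotypes have the same length, so any
    difference is a single-base mismatch such as a heterozygous SNP.
    """
    if len(maternal_block) != len(paternal_block):
        raise ValueError("a gap-free block must have equal-length sequences")
    return [
        i
        for i, (m, p) in enumerate(zip(maternal_block, paternal_block))
        if m.upper() != p.upper()
    ]


# Toy example: one heterozygous SNP at offset 3.
print(mismatch_positions("ACGTAC", "ACGAAC"))
```

The offsets returned for a block should coincide with heterozygous SNP positions in the phased variant set.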

Using CNVnator

Q:
CNVnator appears to be very popular software, yet I could not find an official guide or any directions on how to get started with it. Could you kindly provide these? Also, does your license allow commercial services to be provided based on your program?

A:
Please download the software and read the README file.

Alex Abyzov

Information in .root file

Q:
Using CNVnator, I managed to create the .root file, but I can’t get any further: when I try to create the histograms, it seems to be working, but no files have been created once it’s done.
A:
The new information is added to the .root file you specified on the command line.
During the next calculation step, CNVnator will extract this information from that file.

To browse the contents of the .root file, you can start ROOT and open a browser (type “new TBrowser”).

Please see http://root.cern.ch for details.

CNVnator license

Q:
Does your license allow commercial services to be provided based on your program?

A:
Commercial services can use CNVnator for free, provided that the original software, developers, and paper are credited and cited.

Alex Abyzov

***********************************************************
Department of Molecular Biophysics and Biochemistry,
Yale University, 260 Whitney ave., P.O. Box 208114,
New Haven, CT, 06520, USA
Phone: 1-(203)-432-5405
e-mail: abyzov@gersteinlab.org
URL: http://homes.gersteinlab.org/people/aabyzov
***********************************************************