Q:
I would like to ask what kinds of indels are incorporated into the
diploid genome assembly of the NA12878 individual, available from your lab:
http://sv.gersteinlab.org/NA12878_diploid/NA12878_diploid_dec16.2012.zip
The readme says that 829,454 indels were used to construct this
genome. What confuses me is that when I perform a BLAST search with
one 1.7 kb deletion from NA12878.2010_06.and.fosmid.deletions.phased.vcf
(P2_M_061510_21_73), it shows up in both the maternal and paternal
haplotypes. Is there any size cutoff used for the indels that have been
selected for this assembly?
A:
Unfortunately, the latest version does not include any fosmid indels/SVs. It was built only from the variant output of GATK Best Practices v3, although fosmid data were used to construct earlier versions of the diploid genome. We might include them in a future release. Thank you.
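
If you want to check for yourself which indel sizes a given variant file contains, one option is to compute each allele's length change from the REF and ALT columns of the VCF. The following is only a minimal sketch in Python, not the pipeline used to build the diploid genome. The function names and the toy records are made up for illustration, and symbolic alleles such as <DEL> are skipped because their length is not encoded in REF/ALT.

```python
def indel_lengths(vcf_lines):
    """Yield (chrom, pos, delta) per indel allele; delta < 0 means a deletion."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            if alt.startswith("<") or alt in (".", "*"):
                continue  # symbolic or missing allele: no explicit sequence
            delta = len(alt) - len(ref)
            if delta != 0:  # delta == 0 is a SNP/MNP, not an indel
                yield chrom, pos, delta


def largest_deletion(vcf_lines):
    """Return the record with the most negative delta, or None if no deletions."""
    deletions = [rec for rec in indel_lengths(vcf_lines) if rec[2] < 0]
    return min(deletions, key=lambda rec: rec[2]) if deletions else None


# Toy records (hypothetical, not taken from any NA12878 file).
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",     # SNP, ignored
    "chr1\t200\t.\tACGT\tA\t50\tPASS\t.",  # 3 bp deletion
    "chr1\t300\t.\tT\tTGG\t50\tPASS\t.",   # 2 bp insertion
]

print(largest_deletion(example))
```

To run this on a real file, pass an open file handle instead of the list. If the largest deletion reported is much shorter than 1.7 kb, the event you searched for is simply not in that call set.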