I am using BreakSeq to infer the formation mechanisms of structural variants (SVs) called by several different packages. I am stuck on the svMech module, probably because it has no user manual.
I only want the SV mechanisms, so I have commented out the ancestral state and feature analysis steps in the annotate script under the bin directory of BreakSeq.
It works fine when the GFF file contains only deletions. When the GFF file contains insertions, however, it exits with the following error:
********** Creating standard breakpoint library **********
Traceback (most recent call last):
File "/home/pankaj/breakseq/breakseq-1.3/bin/svUtil/svStd.py", line 20, in <module>
File "/home/pankaj/breakseq/breakseq-1.3/lib/biopy/io/SV.py", line 103, in get_sequence
return self.base.get_sequence(self.name, self.start, self.end)
AttributeError: 'NoneType' object has no attribute 'get_sequence'
Command exited with non-zero status 1
0.13user 0.04system 0:00.21elapsed 83%CPU (0avgtext+0avgdata 60800maxresident)k
0inputs+8outputs (0major+4306minor)pagefaults 0swaps
Could you please answer the following questions about BreakSeq?
(1) For insertions, do I need to provide the inserted sequence explicitly, or does the package determine it internally?
(2) Does the package also infer mechanisms for translocations? If so, which keyword should I use in the third column of the GFF file?
I read your excellent BreakSeq paper, "Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library", and I now have some whole-genome sequencing data to analyze. The breakpoint library you provide (http://sv.gersteinlab.org/breakseq/) is based on human genome NCBI build 36, but I am using NCBI build 37. Should I lift over the coordinates to NCBI build 37, or realign the junction sequences to NCBI build 37 myself? Or is there a pre-compiled breakpoint junction library for NCBI build 37? Also, do you have any suggestions for adding the SVs identified in the 1000 Genomes Project to the breakpoint junction library?
There are two sets of SV breakpoints that should be relevant to you:
(1) The published pilot data is on NCBI build 36. Lifting over the genomic coordinates to NCBI build 37 should suffice. You may want to double-check that the SV sizes and the junction sequences are the same before and after the liftover.
(2) The phase I data is on NCBI build 37. You can simply take the junction sequences at the breakpoints and add them to the library.