Pushing the limits of de novo genome assembly for complex prokaryotic genomes harboring very long, near identical repeats
Generating a complete, de novo genome assembly for prokaryotes is often considered a solved problem. However, here we show that Pseudomonas koreensis P19E3 harbors multiple, near identical repeat pairs up to 70 kilobase pairs in length. Beyond long repeats, the P19E3 assembly was further complicated by a shufflon region. Its complex genome could not be assembled de novo with long reads produced by Pacific Biosciences technology, but required very long reads from Oxford Nanopore Technologies. Another important factor for full genomic resolution was the choice of assembly algorithm. Importantly, a repeat analysis indicated that very complex bacterial genomes represent a general phenomenon beyond Pseudomonas. Roughly 10% of 9331 complete bacterial genomes and a handful of 293 complete archaeal genomes represented this dark matter for de novo genome assembly of prokaryotes. Several of these dark matter genome assemblies contained repeats far beyond the resolution of the sequencing technology employed and likely contain errors; other genomes were closed using labor-intensive steps such as cosmid libraries, primer walking or optical mapping. Using very long sequencing reads in combination with assemblers capable of resolving long, near identical repeats will bring most prokaryotic genomes within reach of fast and complete de novo genome assembly.
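The dependence on read length described above can be illustrated with a minimal sketch. It assumes a simplified rule that is not taken from the study: a read can resolve a repeat only if it covers the whole repeat plus some unique flanking sequence on each side. The function names, the 1000 bp `min_anchor` default and the example read lengths are all hypothetical and chosen only for illustration.

```python
def can_span(read_length, repeat_length, min_anchor=1000):
    """Return True if a read is long enough to cover a repeat plus
    `min_anchor` bases of unique sequence on both sides.

    The anchor size is an illustrative assumption, not a value from the study.
    """
    return read_length >= repeat_length + 2 * min_anchor


def spanning_reads(read_lengths, repeat_length, min_anchor=1000):
    """Keep only the read lengths that could span the given repeat."""
    return [n for n in read_lengths if can_span(n, repeat_length, min_anchor)]


if __name__ == "__main__":
    # Made-up read lengths in base pairs; 70 kb matches the longest
    # repeat pair length mentioned in the abstract.
    reads = [15_000, 40_000, 85_000, 120_000]
    print(spanning_reads(reads, repeat_length=70_000))
```

Under this simplified rule, only reads of at least 72 kb would qualify for a 70 kb repeat, which is one way to see why shorter long reads alone may leave such a genome unresolved.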
Subjects: De novo genome assembly
Schmid M; Frei D; Patrignani A; Schlapbach R; Frey JE; Remus-Emsermann MNP; Ahrens CH (2018). Pushing the limits of de novo genome assembly for complex prokaryotic genomes harboring very long, near identical repeats.