
WiseScaffolder: an algorithm for the semi-automatic scaffolding of Next Generation Sequencing data

Abstract :

Background: The sequencing depth provided by high-throughput sequencing technologies has allowed a rise in the number of de novo sequenced genomes that could potentially be closed without further sequencing. However, genome scaffolding and closure require costly human supervision that often results in genomes being published as drafts. A number of automatic scaffolders were recently released, which improved the global quality of genomes published in the last few years. Yet, none of them reach the efficiency of manual scaffolding.

Results: Here, we present an innovative semi-automatic scaffolder that additionally helps with chimerae resolution and generates valuable contig maps and outputs for manual improvement of the automatic scaffolding. This software was tested on the newly sequenced marine cyanobacterium Synechococcus sp. WH8103 as well as two reference datasets used in previous studies, Rhodobacter sphaeroides and Homo sapiens chromosome 14 (http://gage.cbcb.umd.edu/). The quality of resulting scaffolds was compared to that of three other stand-alone scaffolders: SSPACE, SOPRA and SCARPA. For all three model organisms, WiseScaffolder produced better results than other scaffolders in terms of contiguity statistics (number of genome fragments, N50, LG50, etc.) and, in the case of WH8103, the reliability of the scaffolds was confirmed by whole genome alignment against a closely related reference genome. We also propose an efficient computer-assisted strategy for manual improvement of the scaffolding, using outputs generated by WiseScaffolder, as well as for genome finishing that in our hands led to the circularization of the WH8103 genome.

Conclusion: Altogether, WiseScaffolder proved more efficient than three other scaffolders for both prokaryotic and eukaryotic genomes and is thus likely applicable to most genome projects. The scaffolding pipeline described here should be of particular interest to biologists wishing to take advantage of the high added value of complete genomes.
Document type : Journal articles


https://hal.sorbonne-universite.fr/hal-01193001
Contributor : Gestionnaire 2 HAL-UPMC
Submitted on : Friday, September 4, 2015 - 10:20:45 AM
Last modification on : Friday, January 21, 2022 - 3:27:33 AM
Long-term archiving on : Saturday, December 5, 2015 - 11:47:50 AM

Publication funded by an institution

Licence


Distributed under a Creative Commons Attribution 4.0 International License


Citation

Gregory K. Farrant, Mark Hoebeke, Frédéric Partensky, Gwendoline Andres, Erwan Corre, et al. WiseScaffolder: an algorithm for the semi-automatic scaffolding of Next Generation Sequencing data. BMC Bioinformatics, BioMed Central, 2015, 16, pp.281. ⟨10.1186/s12859-015-0705-y⟩. ⟨hal-01193001⟩

