High-quality draft assemblies of mammalian genomes from massively parallel sequence data
About this item
Author / Creator
Gnerre, Sante; MacCallum, Iain; Przybylski, Dariusz; Ribeiro, Filipe J; Burton, Joshua N; Walker, Bruce J; Sharpe, Ted; Hall, Giles; Shea, Terrance P; Sykes, Sean; Berlin, Aaron M; Aird, Daniel; Costello, Maura; Daza, Riza; Williams, Louise; Nicol, Robert; Gnirke, Andreas; Nusbaum, Chad; Lander, Eric S; Jaffe, David B
Publisher
United States: National Academy of Sciences
Journal title
Proceedings of the National Academy of Sciences
Language
English
More information
Scope and Contents
Contents
Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (approximately 100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd....
Alternative Titles
Full title
High-quality draft assemblies of mammalian genomes from massively parallel sequence data
Authors, Artists and Contributors
Author / Creator
Gnerre, Sante
MacCallum, Iain
Przybylski, Dariusz
Ribeiro, Filipe J
Burton, Joshua N
Walker, Bruce J
Sharpe, Ted
Hall, Giles
Shea, Terrance P
Sykes, Sean
Berlin, Aaron M
Aird, Daniel
Costello, Maura
Daza, Riza
Williams, Louise
Nicol, Robert
Gnirke, Andreas
Nusbaum, Chad
Lander, Eric S
Jaffe, David B
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_jstor_primary_41001892
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_jstor_primary_41001892
Other Identifiers
ISSN
0027-8424
E-ISSN
1091-6490
DOI
10.1073/pnas.1017351108