The GIAB genomic stratifications resource for human reference genomes

About this item

Full title

The GIAB genomic stratifications resource for human reference genomes

Publisher

London: Nature Publishing Group UK

Journal title

Nature Communications, 2024-10, Vol.15 (1), p.9029-13, Article 9029

Language

English

More information

Scope and Contents

Contents

Despite the growing variety of sequencing and variant-calling tools, no workflow performs equally well across the entire human genome. Understanding context-dependent performance is critical for enabling researchers, clinicians, and developers to make informed tradeoffs when selecting sequencing hardware and software. Here we describe a set of “stratifications,” which are BED files that define distinct contexts throughout the genome. We define these for GRCh37/38 as well as the new T2T-CHM13 reference, adding many new hard-to-sequence regions which are critical for understanding performance as the field progresses. Specifically, we highlight the increase in hard-to-map and GC-rich stratifications in CHM13 relative to the previous references. We then compare the benchmarking performance with each reference and show the performance penalty brought about by these additional difficult regions in CHM13. Additionally, we demonstrate how the stratifications can track context-specific improvements over different platform iterations, using Oxford Nanopore Technologies as an example. The means to generate these stratifications are available as a snakemake pipeline at https://github.com/usnistgov/giab-stratifications. We anticipate this being useful in enabling precise risk-reward calculations when building sequencing pipelines for any of the commonly-used reference genomes.
The GIAB genomic stratification resource defines challenging regions in three commonly used human genome references, including the first complete human genome (CHM13). These hel...
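As a rough illustration of how a stratification BED file can be used, the sketch below splits a set of variant calls into those inside and outside one stratification and reports the fraction judged correct in each. It is not the authors' pipeline: the function names (`parse_bed`, `in_regions`, `stratify`), the toy intervals, and the toy calls are all hypothetical, and the lookup assumes the intervals within each chromosome do not overlap.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into {chrom: sorted [(start, end), ...]}.

    BED coordinates are 0-based and half-open: [start, end).
    """
    regions = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions[chrom].append((int(start), int(end)))
    for chrom in regions:
        regions[chrom].sort()
    return dict(regions)


def in_regions(regions, chrom, pos0):
    """True if 0-based position pos0 lies in an interval on chrom.

    Assumption: intervals on a chromosome are non-overlapping, so only the
    last interval starting at or before pos0 needs to be checked.
    """
    intervals = regions.get(chrom, [])
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def stratify(calls, regions):
    """Fraction of correct calls inside vs. outside the stratification.

    calls: iterable of (chrom, pos0, is_correct) tuples (hypothetical input).
    Returns None for a group that contains no calls.
    """
    tally = {"inside": [0, 0], "outside": [0, 0]}  # [correct, total]
    for chrom, pos0, is_correct in calls:
        key = "inside" if in_regions(regions, chrom, pos0) else "outside"
        tally[key][0] += int(is_correct)
        tally[key][1] += 1
    return {k: (c / n if n else None) for k, (c, n) in tally.items()}


# Toy data, invented for illustration only.
toy_bed = ["chr1\t100\t200", "chr1\t500\t600"]
toy_calls = [
    ("chr1", 150, True),
    ("chr1", 199, False),
    ("chr1", 200, True),   # position 200 is outside [100, 200)
    ("chr1", 550, True),
    ("chr2", 150, True),   # chromosome absent from the BED file
]
result = stratify(toy_calls, parse_bed(toy_bed))
print(result)
```

Comparing the two fractions for each stratification is one simple way to see the context-dependent performance the abstract describes; a real benchmark would use precision and recall against a truth set rather than a single correctness flag.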

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_doaj_primary_oai_doaj_org_article_cea87d7d5cfe495988b6700aa70955a2

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_doaj_primary_oai_doaj_org_article_cea87d7d5cfe495988b6700aa70955a2

Other Identifiers

ISSN

2041-1723

E-ISSN

2041-1723

DOI

10.1038/s41467-024-53260-y
