https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2508591880

Integrative approaches to improve the informativeness of deep learning models for human complex diseases

About this item

Full title

Integrative approaches to improve the informativeness of deep learning models for human complex diseases

Publisher

Cold Spring Harbor: Cold Spring Harbor Laboratory Press

Journal title

bioRxiv, 2021-08

Language

English

Formats

Publication information

Publisher

Cold Spring Harbor: Cold Spring Harbor Laboratory Press

More information

Scope and Contents

Contents

Deep learning models have achieved great success in predicting genome-wide regulatory effects from DNA sequence, but recent work has reported that SNP annotations derived from these predictions contribute limited unique information for human complex disease. Here, we explore three integrative approaches to improve the disease informativeness of allelic-effect annotations (predicted difference between reference and variant alleles) constructed using several previously trained deep learning models: DeepSEA, Basenji and DeepBind (and a related machine learning model, deltaSVM). First, we employ gradient boosting to learn optimal combinations of deep learning annotations, using fine-mapped SNPs and matched control SNPs (on held-out chromosomes) for training. Second, we improve the specificity of these annotations by restricting them to SNPs implicated by (proximal and distal) SNP-to-gene (S2G) linking strategies, e.g. prioritizing SNPs involved in gene regulation. Third, we predict gene expression (and derive allelic-effect annotations) from deep learning annotations at SNPs implicated by S2G linking strategies, generalizing the previously proposed ExPecto approach, which incorporates deep learning annotations based on distance to TSS. We evaluated these approaches using stratified LD score regression, using functional data in blood and focusing on 11 autoimmune diseases and blood-related traits (average N=306K). We determined that the three approaches produced SNP annotations that were uniquely informative for these diseases/traits, despite the fact that linear combinations of the underlying DeepSEA, Basenji, DeepBind and deltaSVM blood annotations were not uniquely informative for these diseases/traits. Our results highlight the benefits of integrating SNP annotations produced by deep learning models with other types of data, including data linking SNPs to genes.

Competing Interest Statement: The authors have declared no competing interest.
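As a rough illustration of two ideas in the abstract, the sketch below computes an allelic-effect annotation as the difference in a model's prediction between the variant and reference alleles, and then restricts an annotation to SNPs implicated by an S2G linking strategy. It is not the authors' implementation: `toy_predict` is a hypothetical stand-in for a trained sequence model, the sign convention (variant minus reference) is an assumption, and setting unlinked SNPs to zero is one plausible reading of "restricting".

```python
from typing import Callable, Dict, Set


def allelic_effect(predict: Callable[[str], float], ref_seq: str, alt_seq: str) -> float:
    """Predicted difference between variant and reference allele sequences.

    The sign convention (variant minus reference) is an assumption.
    """
    return predict(alt_seq) - predict(ref_seq)


def restrict_to_s2g(annotation: Dict[str, float], linked_snps: Set[str]) -> Dict[str, float]:
    """Keep annotation values at S2G-linked SNPs and set all others to zero."""
    return {snp: (value if snp in linked_snps else 0.0) for snp, value in annotation.items()}


def toy_predict(seq: str) -> float:
    """Hypothetical stand-in for a trained model: GC fraction of the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Toy usage: one SNP changes T to G in a short window around the variant.
effect = allelic_effect(toy_predict, "ACGTA", "ACGGA")
restricted = restrict_to_s2g({"rs1": effect, "rs2": 0.5}, linked_snps={"rs1"})
```

In practice the prediction function would be one of the trained models named in the abstract, and the set of linked SNPs would come from a proximal or distal S2G strategy; the SNP identifiers here are placeholders.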
Footnotes

* Following reviewer response, we have expanded the set of models from 2 deep learning models (DeepSEA and Basenji) to 4 deep learning/machine learning-based sequence models (DeepSEA, Basenji, DeepBind, deltaSVM). We have also updated the text to clarify the comparisons across methods and the features underlying the performance of these methods in greater detail.
* https://github.com/kkdey/Imperio
* https://alkesgroup.broadinstitute.org/LDSCORE/DeepLearning/Dey_DeepBoost_Imperio/...
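The first approach in the abstract trains gradient boosting on fine-mapped versus control SNPs and scores each SNP with a model that never saw its chromosome. The sketch below shows that held-out-chromosome scheme on toy data. It is a simplified stand-in, not the authors' pipeline: the booster is a minimal pure-Python implementation using depth-1 trees and squared-error loss, and all names, hyperparameters, and data values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (chromosome, annotation features, label: 1 = fine-mapped SNP, 0 = control SNP)
Row = Tuple[str, List[float], int]
Stump = Tuple[int, float, float, float]  # (feature index, threshold, left value, right value)


def fit_stump(X: List[List[float]], residuals: List[float]) -> Optional[Stump]:
    """Find the single-feature midpoint split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        vals = sorted({x[j] for x in X})
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2
            left = [r for x, r in zip(X, residuals) if x[j] <= t]
            right = [r for x, r in zip(X, residuals) if x[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:] if best else None


def fit_boosting(X: List[List[float]], y: List[int], n_rounds: int = 20, lr: float = 0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature has two distinct values
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        pred = [p + lr * (lm if x[j] <= t else rm) for x, p in zip(X, pred)]
    return base, lr, stumps


def predict(model, x: List[float]) -> float:
    base, lr, stumps = model
    return base + sum(lr * (lm if x[j] <= t else rm) for j, t, lm, rm in stumps)


def held_out_scores(rows: List[Row]) -> Dict[int, float]:
    """Score each SNP with a model trained only on the other chromosomes."""
    scores: Dict[int, float] = {}
    for chrom in sorted({r[0] for r in rows}):
        train = [r for r in rows if r[0] != chrom]
        model = fit_boosting([r[1] for r in train], [r[2] for r in train])
        for i, r in enumerate(rows):
            if r[0] == chrom:
                scores[i] = predict(model, r[1])
    return scores


# Toy data: feature 0 separates the classes, feature 1 varies only by chromosome.
rows: List[Row] = [
    ("chr1", [0.9, 0.5], 1), ("chr1", [0.1, 0.5], 0),
    ("chr2", [0.8, 0.2], 1), ("chr2", [0.2, 0.2], 0),
    ("chr3", [0.7, 0.8], 1), ("chr3", [0.3, 0.8], 0),
]
scores = held_out_scores(rows)
```

The resulting per-SNP scores play the role of a boosted annotation. A real analysis would likely use an established boosting library with a classification loss and far more SNPs; the point here is only that training and scoring are kept on disjoint chromosomes.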

Alternative Titles

Full title

Integrative approaches to improve the informativeness of deep learning models for human complex diseases

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_2508591880

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2508591880

Other Identifiers

E-ISSN

2692-8205

DOI

10.1101/2020.09.08.288563