SparSNP: fast and memory-efficient analysis of all SNPs for phenotype prediction
About this item
Publisher
England: BioMed Central Ltd
Language
English
More information
Scope and Contents
Contents
A central goal of genomics is to predict phenotypic variation from genetic variation. Fitting predictive models to genome-wide and whole-genome single nucleotide polymorphism (SNP) profiles allows us to estimate the predictive power of the SNPs and potentially develop diagnostic models for disease. However, many current datasets cannot be analysed with standard tools because of their large size.
We introduce SparSNP, a tool for fitting lasso linear models to massive SNP datasets quickly and with very low memory requirements. In an analysis of a large celiac disease case/control dataset, we show that SparSNP runs substantially faster than four other state-of-the-art tools for fitting large-scale penalised models. SparSNP was one of only two tools that could successfully fit models to the entire celiac disease dataset, and it did so with superior performance. Compared with the other tools, the models generated by SparSNP had equal or better predictive performance in cross-validation.
Genomic datasets are rapidly increasing in size, rendering existing approaches to model fitting impractical due to their prohibitive time or memory requirements. This study shows that SparSNP is an essential addition to the genomic analysis toolkit. SparSNP is available at http://www.genomics.csse.unimelb.edu.au/SparSNP....
Alternative Titles
Full title
SparSNP: fast and memory-efficient analysis of all SNPs for phenotype prediction
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_doaj_primary_oai_doaj_org_article_df22487d410b410e87eb702995f3153b
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_doaj_primary_oai_doaj_org_article_df22487d410b410e87eb702995f3153b
Other Identifiers
ISSN
1471-2105
E-ISSN
1471-2105
DOI
10.1186/1471-2105-13-88