COSGAP: COntainerized Statistical Genetics Analysis Pipelines
About this item
Author / Creator
Akdeniz, Bayram Cevdet; Frei, Oleksandr; Hagen, Espen; Filiz, Tahir Tekin; Karthikeyan, Sandeep; Pasman, Joëlle; Jangmo, Andreas; Bergstedt, Jacob; Shorter, John R; Zetterberg, Richard; Meijsen, Joeri; Sønderby, Ida Elken; Buil, Alfonso; Tesli, Martin; Lu, Yi; Sullivan, Patrick; Andreassen, Ole A; Hovig, Eivind
Publisher
England: Oxford University Press
Journal title
Bioinformatics Advances
Language
English
More information
Scope and Contents
Contents
Abstract
Summary
The collection and analysis of sensitive data in large-scale statistical genetics consortia are hampered by multiple challenges that stem from the non-shareable nature of the data. Installing software is often time-consuming because of differing operating systems, software dependencies, and limited internet access. In federated analyses across sites, it can be difficult to resolve problems such as format requirements, data wrangling, and setting up analyses on high-performance computing (HPC) facilities. Simpler, more standardized, and automated protocols and pipelines can help overcome these issues. We have developed one such solution for statistical genetic data analysis using software container technologies. This solution, named COSGAP: “COntainerized Statistical Genetics Analysis Pipelines,” consists of established software tools packaged in Singularity containers, together with the corresponding code and instructions for performing statistical genetic analyses such as genome-wide association studies, polygenic scoring, LD score regression, Gaussian Mixture Models, and gene-set analysis. Using the provided helper scripts, written in Python, users can obtain auto-generated scripts to run the desired analysis either on HPC facilities or on a personal computer. COSGAP is actively being applied by users from different countries and projects to conduct genetic data analyses without spending much effort on software installation, data format conversion, and other technical requirements.
Availability and implementation
COSGAP is freely available on GitHub (https://github.com/comorment/containers) under the GPLv3 license.
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11132817
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11132817
Other Identifiers
ISSN
2635-0041
E-ISSN
2635-0041
DOI
10.1093/bioadv/vbae067