Efficient inference of homologs in large eukaryotic pan-proteomes

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_doaj_primary_oai_doaj_org_article_d81c481b08054cdfa979ca5635f2eb31

About this item

Full title

Efficient inference of homologs in large eukaryotic pan-proteomes

Publisher

England: BioMed Central Ltd

Journal title

BMC bioinformatics, 2018-09, Vol.19 (1), p.340-340, Article 340

Language

English

More information

Scope and Contents

Contents

Identification of homologous genes is fundamental to comparative genomics, functional genomics and phylogenomics. Extensive public homology databases are of great value for investigating homology but need to be continually updated to incorporate new sequences. As new sequences are rapidly being generated, there is a need for efficient standalone tools to detect homologs in novel data.
To address this, we present a fast method for detecting homology groups across a large number of individuals and/or species. We adopted a k-mer-based approach that considerably reduces the number of pairwise protein alignments without sacrificing sensitivity. We demonstrate the accuracy, scalability, efficiency and applicability of the presented method for detecting homology in large proteomes of bacteria, fungi, plants and Metazoa.
We observed a clear trade-off between recall and precision in our homology inference; which of the two to favor depends strongly on the application. The clustering behavior of our program can be tuned for a particular application by altering a few key parameters. The program is available for public use at https://github.com/sheikhizadeh/pantools as an extension to our pan-genomic analysis tool, PanTools....
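
The k-mer idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the PanTools implementation; it only shows, under simple assumptions, how an index of shared k-mers can limit which protein pairs are passed on to a costly alignment step. The function names (`kmers`, `candidate_pairs`), the parameters `k` and `min_shared`, and the toy sequences are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmers(sequence, k):
    """Return the set of distinct k-mers in a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def candidate_pairs(proteins, k=3, min_shared=2):
    """Return pairs of protein ids sharing at least `min_shared` distinct k-mers.

    Hypothetical prefilter: only these pairs would be aligned afterwards.
    """
    # Inverted index: k-mer -> ids of the proteins that contain it.
    index = defaultdict(set)
    for pid, seq in proteins.items():
        for kmer in kmers(seq, k):
            index[kmer].add(pid)

    # Count shared k-mers for every pair that co-occurs in some index entry.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    return {pair for pair, count in shared.items() if count >= min_shared}


# Toy input (made-up sequences, not real proteins).
proteins = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQL",
    "p3": "GGSWWPLLNN",
}
print(candidate_pairs(proteins))
```

In this toy input only the pair `p1`/`p2` shares any 3-mers, so it is the only pair that would be aligned. Raising `min_shared` makes the filter stricter and lowering it makes the filter more permissive, which is one simple way a recall/precision trade-off of the kind described above can arise.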

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_doaj_primary_oai_doaj_org_article_d81c481b08054cdfa979ca5635f2eb31

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_doaj_primary_oai_doaj_org_article_d81c481b08054cdfa979ca5635f2eb31

Other Identifiers

ISSN

1471-2105

E-ISSN

1471-2105

DOI

10.1186/s12859-018-2362-4
