A deep database of medical abbreviations and acronyms for natural language processing
About this item
Full title
Author / Creator
Publisher
London: Nature Publishing Group UK
Journal title
Language
English
Formats
Subjects
More information
Scope and Contents
Contents
The recognition, disambiguation, and expansion of medical abbreviations and acronyms are of utmost importance to prevent medically dangerous misinterpretation in natural language processing. To support recognition, disambiguation, and expansion, we present the Medical Abbreviation and Acronym Meta-Inventory, a deep database of medical abbreviations. A systematic harmonization of eight source inventories across multiple healthcare specialties and settings identified 104,057 abbreviations with 170,426 corresponding senses. Automated cross-mapping of synonymous records using state-of-the-art machine learning reduced redundancy, which simplifies future application. Additional features include semi-automated quality control to remove errors. The Meta-Inventory demonstrated high completeness, or coverage, of abbreviations and senses in new clinical text, a substantial improvement over the next largest repository (6–14% increase in abbreviation coverage; 28–52% increase in sense coverage). To our knowledge, the Meta-Inventory is the most complete compilation of medical abbreviations and acronyms in American English to date. The multiple sources and high coverage support application in varied specialties and settings. This allows for cross-institutional natural language processing, which previous inventories did not support. The Meta-Inventory is available at https://bit.ly/github-clinical-abbreviations.
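The notion of coverage used in the abstract can be illustrated with a small sketch. It assumes coverage means the fraction of distinct abbreviations (or abbreviation-sense pairs) observed in new text that also appear in an inventory; the record does not state the authors' exact procedure. The function name `coverage` and the toy data are hypothetical and are not drawn from the Meta-Inventory.

```python
def coverage(observed, inventory):
    """Fraction of distinct observed items that also appear in the inventory.

    Assumed definition for illustration only; not the authors' published method.
    """
    observed = set(observed)
    if not observed:
        return 0.0
    return len(observed & set(inventory)) / len(observed)


# Hypothetical toy inventory of (abbreviation, sense) pairs.
inventory_senses = {
    ("MI", "myocardial infarction"),
    ("MI", "mitral insufficiency"),
    ("RA", "rheumatoid arthritis"),
}

# Hypothetical (abbreviation, sense) pairs annotated in new clinical text.
observed_senses = [
    ("MI", "myocardial infarction"),
    ("RA", "right atrium"),
    ("RA", "rheumatoid arthritis"),
    ("PT", "physical therapy"),
]

# Abbreviation coverage ignores the sense and compares short forms only.
abbr_cov = coverage(
    (abbr for abbr, _ in observed_senses),
    (abbr for abbr, _ in inventory_senses),
)

# Sense coverage requires the full (abbreviation, sense) pair to match.
sense_cov = coverage(observed_senses, inventory_senses)

print(f"abbreviation coverage: {abbr_cov:.2f}")
print(f"sense coverage: {sense_cov:.2f}")
```

In this toy case sense coverage is lower than abbreviation coverage, because an abbreviation such as RA can be present in the inventory while one of its observed senses is missing.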
Measurement(s)
Controlled Vocabulary • Linguistic Form
Technology Type(s)
digital curation • data combination
Sample Characteristic - Location
United States of America
Machine-accessible metadata file describing the reported data:
https://doi.or...
Alternative Titles
Full title
A deep database of medical abbreviations and acronyms for natural language processing
Authors, Artists and Contributors
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_doaj_primary_oai_doaj_org_article_22f299c14aa7499aaf230bed4f8e868b
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_doaj_primary_oai_doaj_org_article_22f299c14aa7499aaf230bed4f8e868b
Other Identifiers
ISSN
2052-4463
E-ISSN
2052-4463
DOI
10.1038/s41597-021-00929-4