Log in to save to my catalogue

DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Datas...

DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Datas...

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3128886757

DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine

About this item

Full title

DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-11

Language

English

Formats

Publication information

Publisher

Ithaca: Cornell University Library, arXiv.org

More information

Scope and Contents

Contents

We introduce DAHL, a benchmark dataset and automated evaluation system designed to assess hallucination in long-form text generation, specifically within the biomedical domain. Our benchmark dataset, meticulously curated from biomedical research papers, consists of 8,573 questions across 29 categories. DAHL evaluates fact-conflicting hallucinations...

Alternative Titles

Full title

DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine

Authors, Artists and Contributors

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_3128886757

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3128886757

Other Identifiers

E-ISSN

2331-8422

How to access this item