Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3121360204

About this item

Full title

Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-10

Language

English

More information

Scope and Contents

Contents

Recent studies have identified one aggravating factor of LLM hallucinations as the knowledge inconsistency between pre-training and fine-tuning, where unfamiliar fine-tuning data mislead the LLM to fabricate plausible but wrong outputs. In this paper, we propose a novel fine-tuning strategy called Prereq-Tune to address this knowledge inconsistency and reduce hallucinations. Fundamentally, Prereq-Tune disentangles the learning of skills and knowledge, so the model learns only the task skills without being impacted by the knowledge inconsistency. To achieve this, Prereq-Tune introduces an additional prerequisite learning stage to learn the necessary knowledge for SFT, allowing subsequent SFT to focus only on task skills. Prereq-Tune can also be combined with fictitious synthetic data to enhance the grounding of LLM outputs to their internal knowledge. Experiments show that Prereq-Tune outperforms existing baselines in improving LLM's factuality across short QA and long-form generation tasks. It also opens new possibilities for knowledge-controlled generation in LLMs. Our code is available at https://github.com/UCSB-NLP-Chang/Prereq_tune.git.

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_3121360204

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3121360204

Other Identifiers

E-ISSN

2331-8422
