Selective Generation for Controllable Language Models

About this item

Full title

Selective Generation for Controllable Language Models

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-11

Language

English

Scope and Contents

Trustworthiness of generative language models (GLMs) is crucial for their deployment in critical decision-making systems. Hence, certified risk control methods such as selective prediction and conformal prediction have been applied to mitigate the hallucination problem in various supervised downstream tasks. However, the lack of an appropriate correctness metric hinders the application of such principled methods to language generation tasks. In this paper, we circumvent this problem by leveraging the concept of textual entailment to evaluate the correctness of the generated sequence, and propose two selective generation algorithms which control the false discovery rate with respect to the textual entailment relation (FDR-E) with a theoretical guarantee: \(\texttt{SGen}^{\texttt{Sup}}\) and \(\texttt{SGen}^{\texttt{Semi}}\). \(\texttt{SGen}^{\texttt{Sup}}\), a direct modification of selective prediction, is a supervised learning algorithm which exploits entailment-labeled data annotated by humans. Since human annotation is costly, we further propose a semi-supervised version, \(\texttt{SGen}^{\texttt{Semi}}\), which fully utilizes the unlabeled data by pseudo-labeling, leveraging an entailment set function learned via conformal prediction. Furthermore, \(\texttt{SGen}^{\texttt{Semi}}\) enables the use of a more general class of selection functions, neuro-selection functions, and provides users with an optimal selection function class given multiple candidates. Finally, we demonstrate the efficacy of the \(\texttt{SGen}\) family in achieving a desired FDR-E level with selection efficiency comparable to baselines on both open- and closed-source GLMs. Code and datasets are provided at https://github.com/ml-postech/selective-generation.
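The selection idea the abstract describes can be illustrated with a minimal sketch: given calibration examples with human entailment labels and model confidence scores, search for the most permissive confidence threshold whose empirical false discovery rate with respect to entailment (FDR-E) stays below a target level. The function name `select_threshold` and the toy data are illustrative assumptions, not the paper's API, and this sketch omits the statistical machinery (e.g. concentration bounds) that gives the actual \(\texttt{SGen}\) algorithms their certified guarantee.

```python
def select_threshold(scores, entailed, eps):
    """Illustrative threshold search for selective generation.

    scores:   model confidence for each calibration generation
    entailed: 1 if the generation is entailed by the reference
              (human-labeled), 0 otherwise
    eps:      target FDR-E level

    Returns the lowest (most permissive) confidence threshold whose
    empirical FDR-E among accepted generations is at most eps, or
    None if no threshold qualifies. NOTE: this is the empirical
    heuristic only; SGen adds a high-probability guarantee on top.
    """
    pairs = sorted(zip(scores, entailed), reverse=True)  # most confident first
    best = None
    wrong = 0
    for k, (score, label) in enumerate(pairs, start=1):
        wrong += 1 - label
        if wrong / k <= eps:       # empirical FDR-E among the k accepted
            best = score           # accept everything scored >= best
    return best

# Toy calibration data (hypothetical): confidences and entailment labels.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
entailed = [1, 1, 1, 0, 1, 0, 0]
tau = select_threshold(scores, entailed, eps=0.25)
```

At test time one would generate only when the model's confidence exceeds `tau` and abstain otherwise, trading coverage for a controlled rate of non-entailed (hallucinated) outputs.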

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_2839576732

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2839576732

Other Identifiers

E-ISSN

2331-8422
