In-context Learning and Induction Heads
About this item
Full title
In-context Learning and Induction Heads
Author / Creator
Olsson, Catherine; Elhage, Nelson; Nanda, Neel; Joseph, Nicholas; DasSarma, Nova; Henighan, Tom; Mann, Ben; Askell, Amanda; Bai, Yuntao; Chen, Anna; Conerly, Tom; Drain, Dawn; Ganguli, Deep; Hatfield-Dodds, Zac; Hernandez, Danny; Johnston, Scott; Jones, Andy; Kernion, Jackson; Lovitt, Liane; Ndousse, Kamal; Amodei, Dario; Brown, Tom; Clark, Jack; Kaplan, Jared; McCandlish, Sam; Olah, Chris
Publisher
Ithaca: Cornell University Library, arXiv.org
Language
English
More information
Scope and Contents
Contents
"Induction heads" are attention heads that implement a simple algorithm to complete token sequences like [A][B] ... [A] -> [B]. In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing los...
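The completion rule named in the abstract, [A][B] ... [A] -> [B], can be illustrated with a minimal sketch. It reproduces only the input/output behaviour with a plain backward search over the token list; it does not model how an attention head would compute the result, and the function name induction_complete is a hypothetical one chosen for this illustration.

```python
from typing import Optional, Sequence


def induction_complete(tokens: Sequence[str]) -> Optional[str]:
    """Predict the next token with the rule [A][B] ... [A] -> [B].

    Illustrative assumption: the most recent earlier occurrence of the
    current token wins, and None is returned when there is no earlier match.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Walk backwards over earlier positions, looking for a previous [A].
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            # Copy the token that followed the earlier [A], i.e. [B].
            return tokens[i + 1]
    return None


print(induction_complete(["A", "B", "C", "A"]))
```

For the sequence A, B, C, A the sketch finds the earlier A and returns the token that followed it, B; a sequence with no repeated token yields None.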
Alternative Titles
Full title
In-context Learning and Induction Heads
Authors, Artists and Contributors
Author / Creator
Olsson, Catherine
Elhage, Nelson
Nanda, Neel
Joseph, Nicholas
DasSarma, Nova
Henighan, Tom
Mann, Ben
Askell, Amanda
Bai, Yuntao
Chen, Anna
Conerly, Tom
Drain, Dawn
Ganguli, Deep
Hatfield-Dodds, Zac
Hernandez, Danny
Johnston, Scott
Jones, Andy
Kernion, Jackson
Lovitt, Liane
Ndousse, Kamal
Amodei, Dario
Brown, Tom
Clark, Jack
Kaplan, Jared
McCandlish, Sam
Olah, Chris
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_proquest_journals_2718479676
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2718479676
Other Identifiers
E-ISSN
2331-8422