Scaling Laws and Interpretability of Learning from Repeated Data
Scaling Laws and Interpretability of Learning from Repeated Data
About this item
Full title
Author / Creator
Publisher
Ithaca: Cornell University Library, arXiv.org
Journal title
Language
English
Formats
Publication information
Publisher
Ithaca: Cornell University Library, arXiv.org
Subjects
More information
Scope and Contents
Contents
Recent large language models have been trained on vast datasets, but also often on repeated data, either intentionally for the purpose of upweighting higher quality data, or unintentionally because data deduplication is not perfect and the model is exposed to repeated data at the sentence, paragraph, or document level. Some works have reported subs...
Alternative Titles
Full title
Scaling Laws and Interpretability of Learning from Repeated Data
Authors, Artists and Contributors
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_proquest_journals_2668579778
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2668579778
Other Identifiers
E-ISSN
2331-8422