A Tale of Tails: Model Collapse as a Change of Scaling Laws
Authors: Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2. |
| Researcher Affiliation | Collaboration | Meta FAIR; Center for Data Science, New York University; School of Mathematical Sciences, Peking University; Courant Institute, New York University. |
| Pseudocode | No | The paper describes various models and algorithms but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not state that its own code is open-source or provide a link to a repository for the described methodology. |
| Open Datasets | Yes | We empirically verify these theoretical predictions (see Figure 4): (1) in large-scale experiments on an LLM, fine-tuning Llama2-7B (Touvron et al., 2023) on an approximately 2M sample dataset from Wikitext-103 |
| Dataset Splits | No | The paper mentions training on a dataset and evaluating on a test set, but does not specify explicit train/validation/test dataset splits or their percentages/counts. |
| Hardware Specification | No | This work was supported in part through the NYU IT High Performance Computing resources, services, and staff expertise. This is too general and lacks specific hardware details like GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like Llama2-7B, LoRA, and Adam optimizer, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Throughout the finetuning process, we maintain consistent settings using learning rate 5e-5 for LoRA, using Adam optimizer, dropout rate 0.1, trainable parameter fraction 0.062%. A hedged configuration sketch appears below the table. |
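
The reported experiment setup can be illustrated as a LoRA fine-tuning configuration. The following is a minimal sketch, assuming the Hugging Face `transformers`, `peft`, and `datasets` libraries (the paper pins no library versions); the learning rate (5e-5), Adam-family optimizer, and LoRA dropout (0.1) follow the quoted settings, while the checkpoint name, LoRA rank, target modules, batch size, and epoch count are assumptions chosen only so that the trainable-parameter fraction lands near the reported 0.062%.

```python
# Hedged sketch of the reported setup: LoRA fine-tuning of Llama2-7B on
# Wikitext-103 with learning rate 5e-5, an Adam-family optimizer, and
# LoRA dropout 0.1. Unreported details below are marked as assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"   # assumed checkpoint identifier
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA adapter: rank/alpha and target modules are guesses; r=8 on the query
# and value projections of a 7B model yields roughly 0.06% trainable
# parameters, close to the reported 0.062%. Dropout matches the reported 0.1.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.1,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # prints the trainable-parameter fraction

# Wikitext-103 (approximately 2M training samples per the paper),
# tokenized for causal language modeling.
raw = load_dataset("wikitext", "wikitext-103-raw-v1")
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="llama2-wikitext-lora",
    learning_rate=5e-5,             # reported LoRA learning rate
    optim="adamw_torch",            # Adam-family optimizer, as reported
    per_device_train_batch_size=4,  # not reported; assumption
    num_train_epochs=1,             # not reported; assumption
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

This is a reconstruction under the stated assumptions, not the authors' released pipeline; since the paper provides no code or dependency versions, the unreported choices above would need to be varied to reproduce its results faithfully.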