Model Collapse Demystified: The Case of Regression

Authors: Elvis Dohmatob, Yunzhen Feng, Julia Kempe

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical results are validated with experiments.
Researcher Affiliation | Collaboration | FAIR, Meta; Center for Data Science, New York University; Courant Institute of Mathematical Sciences, New York University
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | We only use one publicly available dataset, MNIST, and no idiosyncratic model. Thus, we provide neither dataset nor code, as the dataset is publicly available, and the experiments are easy to reproduce from their description.
Open Datasets | Yes | We conduct experiments using kernel ridge regression on the MNIST dataset [16]. (A sketch of this experiment follows the table.)
Dataset Splits | No | The classification dataset contains 60,000 training and 10,000 test data points (handwritten digits), with labels from 0 to 9 inclusive.
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory) were mentioned for the experimental setup. The acknowledgments only vaguely refer to 'NYU IT High Performance Computing (HPC) resources, services, and staff expertise'.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, scikit-learn versions) were mentioned in the paper.
Experiment Setup | Yes | Specifically, the models were trained using stochastic gradient descent (SGD) with a batch size of 128 and a learning rate of 0.1. We employed a regression setting where labels were converted to one-hot vectors, and the model was trained using mean squared error for 200 epochs to convergence. When generating the synthetic data, Gaussian label noise with a standard deviation of 0.1 is added. (A sketch of this training configuration also follows the table.)
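
The Open Datasets row quotes the paper's kernel ridge regression experiment on MNIST. The following is a minimal sketch of what such a self-consuming (model collapse) loop could look like, assuming scikit-learn's KernelRidge; the RBF kernel, regularization strength, subsample size, and number of generations are illustrative assumptions, while the MNIST data, one-hot regression targets, and the 0.1-std Gaussian label noise come from the rows above.

```python
# Sketch of iterated kernel ridge regression on MNIST with self-generated labels.
# Assumptions (not from the paper): RBF kernel, alpha, subsample size, 5 generations.
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# MNIST: 60,000 training / 10,000 test points, labels 0-9.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0
y = y.astype(int)
X_train, y_train = X[:60_000], y[:60_000]
X_test, y_test = X[60_000:], y[60_000:]

# Regression setting: labels converted to one-hot vectors.
def one_hot(labels, num_classes=10):
    out = np.zeros((labels.shape[0], num_classes))
    out[np.arange(labels.shape[0]), labels] = 1.0
    return out

# Subsample so the exact kernel solve stays tractable (illustrative choice).
idx = rng.choice(60_000, size=2_000, replace=False)
Xs, Ys = X_train[idx], one_hot(y_train[idx])

noise_std = 0.1      # Gaussian label noise added when generating synthetic data
n_generations = 5    # number of self-consuming generations (illustrative)

for gen in range(n_generations):
    model = KernelRidge(alpha=1e-3, kernel="rbf", gamma=1e-2)
    model.fit(Xs, Ys)
    test_mse = mean_squared_error(one_hot(y_test), model.predict(X_test))
    print(f"generation {gen}: test MSE = {test_mse:.4f}")
    # The next generation trains on labels produced by the current model plus noise.
    Ys = model.predict(Xs) + noise_std * rng.normal(size=Ys.shape)
```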
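
The Experiment Setup row specifies SGD with batch size 128, learning rate 0.1, mean squared error on one-hot labels, 200 epochs, and 0.1-std label noise for synthetic data. Below is a hedged PyTorch sketch of that configuration; the network architecture and the placeholder data tensors are assumptions, not details taken from the paper.

```python
# Sketch of the reported training configuration. Only the optimizer (SGD),
# batch size 128, learning rate 0.1, MSE loss on one-hot targets, 200 epochs,
# and the 0.1-std Gaussian noise for synthetic labels come from the report;
# the architecture and the random placeholder data are assumptions.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Placeholder data shaped like flattened MNIST; replace with the real dataset.
X = torch.rand(60_000, 784)
y = torch.randint(0, 10, (60_000,))
Y = nn.functional.one_hot(y, num_classes=10).float()  # one-hot regression targets

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
loader = DataLoader(TensorDataset(X, Y), batch_size=128, shuffle=True)

for epoch in range(200):  # trained to convergence per the reported setup
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

# Synthetic labels for the next generation: model outputs plus Gaussian noise.
with torch.no_grad():
    Y_synth = model(X) + 0.1 * torch.randn_like(Y)
```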