Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Least Squares Regression Can Exhibit Under-Parameterized Double Descent

Authors: Xinyue Li, Rishi Sonthalia

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We numerically verify the predictions from Theorems 1 and 2. Figure 2 shows that the theoretically predicted risk matches the numerical risk, thus verifying that double descent occurs in the under-parameterized regime.
Researcher Affiliation | Academia | Xinyue Li (Applied Math, Yale University); Rishi Sonthalia (Math, Boston College)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | All code for the experiments can be found at https://github.com/rsonthal/Under-Parameterized-Double-Descent
Open Datasets | Yes | All data used is synthetic or already open source. [...] We verify the conjectured formula as well as the role of low dimensionality using MNIST data.
Dataset Splits | No | The paper focuses on theoretical analysis and empirical verification with synthetic data, for which it does not specify train/validation splits. For the MNIST experiment, it mentions using the "complete MNIST test data" but does not describe a validation split for the training data.
Hardware Specification | Yes | All experiments were conducted using Pytorch and run on Google Colab using an A100 GPU.
Software Dependencies | No | The paper mentions using "Pytorch" but does not specify its version or the versions of any other software libraries or dependencies.
Experiment Setup | Yes | For each configuration of the parameters Ntrn, Ntst, d, σtrn, σtst, and µ, we sampled u, vtrn, vtst uniformly at random from the appropriate dimensional sphere for each trial. We also sampled new training and test noise for each trial. For the data scaling regime, we kept d = 1000, and for the parameter scaling regime, we kept Ntrn = 1000. For all experiments, Ntst = 1000.
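The quoted setup samples ground-truth directions uniformly at random from a sphere and measures least squares risk over fresh noise draws per trial. As a rough illustration only, here is a minimal NumPy sketch of one such trial: the paper's actual data model (including the roles of vtrn, vtst, and µ) is not fully specified in this card, so the Gaussian design matrix, the noise levels, and the Ntrn value below are hypothetical stand-ins; only d = 1000 and Ntst = 1000 come from the quoted setup.

```python
import numpy as np

def sample_sphere(dim, rng):
    """Sample uniformly from the unit sphere in R^dim by
    normalizing a standard Gaussian vector."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# d and Ntst follow the quoted setup; Ntrn and the noise
# scales are illustrative choices, not the paper's values.
d, n_trn, n_tst = 1000, 500, 1000
sigma_trn, sigma_tst = 0.1, 0.1

rng = np.random.default_rng(0)
u = sample_sphere(d, rng)                      # ground-truth direction
X_trn = rng.standard_normal((n_trn, d))        # hypothetical Gaussian design
y_trn = X_trn @ u + sigma_trn * rng.standard_normal(n_trn)

# Minimum-norm least squares fit via the pseudoinverse.
u_hat = np.linalg.pinv(X_trn) @ y_trn

# Fresh test data and noise, as in the quoted per-trial procedure.
X_tst = rng.standard_normal((n_tst, d))
y_tst = X_tst @ u + sigma_tst * rng.standard_normal(n_tst)
test_risk = np.mean((X_tst @ u_hat - y_tst) ** 2)
```

Sweeping Ntrn with d fixed (the data scaling regime) or d with Ntrn fixed (the parameter scaling regime) and averaging `test_risk` over trials would then trace out the risk curves the paper compares against its theoretical predictions.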