Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Least Squares Regression Can Exhibit Under-Parameterized Double Descent

Authors: Xinyue Li, Rishi Sonthalia

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We numerically verify the predictions from Theorems 1 and 2. Figure 2 shows that the theoretically predicted risk matches the numerical risk, thus verifying that double descent occurs in the under-parameterized regime.
Researcher Affiliation | Academia | Xinyue Li (Applied Math, Yale University); Rishi Sonthalia (Math, Boston College)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | All code for the experiments can be found at https://github.com/rsonthal/Under-Parameterized-Double-Descent
Open Datasets | Yes | All data used is synthetic or already open source. [...] We verify the conjectured formula as well as the role of low dimensionality using MNIST data.
Dataset Splits | No | The paper focuses on theoretical analysis and empirical verification with synthetic data, for which it does not specify train/validation splits. For the MNIST experiment, it mentions using the "complete MNIST test data" but does not describe a validation split for the training data.
Hardware Specification | Yes | All experiments were conducted using Pytorch and run on Google Colab using an A100 GPU.
Software Dependencies | No | The paper mentions using "Pytorch" but does not specify its version or the versions of any other software libraries or dependencies.
Experiment Setup | Yes | For each configuration of the parameters Ntrn, Ntst, d, σtrn, σtst, and µ, we sampled u, vtrn, vtst uniformly at random from the appropriate dimensional sphere for each trial. We also sampled new training and test noise for each trial. For the data scaling regime, we kept d = 1000, and for the parameter scaling regime, we kept Ntrn = 1000. For all experiments, Ntst = 1000.
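The quoted setup samples ground-truth directions uniformly at random from a sphere and measures least squares risk over fresh noise draws per trial. As a rough illustration only, here is a minimal NumPy sketch of one such trial: the paper's actual data model (including the roles of vtrn, vtst, and µ) is not fully specified in this card, so the Gaussian design matrix, the noise levels, and the Ntrn value below are hypothetical stand-ins; only d = 1000 and Ntst = 1000 come from the quoted setup.

```python
import numpy as np

def sample_sphere(dim, rng):
    """Sample uniformly from the unit sphere in R^dim by
    normalizing a standard Gaussian vector."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# d and Ntst follow the quoted setup; Ntrn and the noise
# scales are illustrative choices, not the paper's values.
d, n_trn, n_tst = 1000, 500, 1000
sigma_trn, sigma_tst = 0.1, 0.1

rng = np.random.default_rng(0)
u = sample_sphere(d, rng)                      # ground-truth direction
X_trn = rng.standard_normal((n_trn, d))        # hypothetical Gaussian design
y_trn = X_trn @ u + sigma_trn * rng.standard_normal(n_trn)

# Minimum-norm least squares fit via the pseudoinverse.
u_hat = np.linalg.pinv(X_trn) @ y_trn

# Fresh test data and noise, as in the quoted per-trial procedure.
X_tst = rng.standard_normal((n_tst, d))
y_tst = X_tst @ u + sigma_tst * rng.standard_normal(n_tst)
test_risk = np.mean((X_tst @ u_hat - y_tst) ** 2)
```

Sweeping Ntrn with d fixed (the data scaling regime) or d with Ntrn fixed (the parameter scaling regime) and averaging `test_risk` over trials would then trace out the risk curves the paper compares against its theoretical predictions.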