Least Squares Regression Can Exhibit Under-Parameterized Double Descent
Authors: Xinyue Li, Rishi Sonthalia
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We numerically verify the predictions from Theorems 1 and 2. Figure 2 shows that the theoretically predicted risk matches the numerical risk, thus verifying that double descent occurs in the under-parameterized regime. |
| Researcher Affiliation | Academia | Xinyue Li Applied Math, Yale University xinyue.li.xl728@yale.edu Rishi Sonthalia Math, Boston College rishi.sonthalia@bc.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All code for the experiments can be found at https://github.com/rsonthal/Under-Parameterized-Double-Descent |
| Open Datasets | Yes | All data used is synthetic or already open source. [...] We verify the conjectured formula as well as the role of low dimensionality using MNIST data. |
| Dataset Splits | No | The paper focuses on theoretical analysis and empirical verification with synthetic data, for which it specifies no train/validation splits. For the MNIST experiment, it mentions using the "complete MNIST test data" but does not describe how a validation split was prepared from the training data. |
| Hardware Specification | Yes | All experiments were conducted using Pytorch and run on Google Colab using an A100 GPU. |
| Software Dependencies | No | The paper mentions using "Pytorch" but does not specify its version or the versions of any other software libraries or dependencies. |
| Experiment Setup | Yes | For each configuration of the parameters Ntrn, Ntst, d, σtrn, σtst, and µ, we ran repeated trials. For each trial, we sampled u, vtrn, vtst uniformly at random from the sphere of the appropriate dimension, and drew fresh training and test noise. In the data scaling regime we fixed d = 1000, and in the parameter scaling regime we fixed Ntrn = 1000. For all experiments, Ntst = 1000. |
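The experiment setup above can be sketched in a few lines. This is a hypothetical minimal reconstruction, not the paper's exact data model: the signal direction `u` is drawn uniformly from the unit sphere as described, but the noise model, the role of µ and of vtrn/vtst, and the trial counts are simplified assumptions (dimensions are also shrunk from 1000 for speed). The paper's actual code is in the linked repository.

```python
import numpy as np

def sample_sphere(dim, rng):
    """Draw a point uniformly at random from the unit sphere in R^dim."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def trial_risk(n_trn, n_tst, d, sigma_trn, sigma_tst, rng):
    """One trial: fit least squares on noisy data, return mean test risk.

    Assumed simplified model: y = X u + noise, with u on the unit sphere
    and fresh Gaussian noise drawn per trial, as in the setup description.
    """
    u = sample_sphere(d, rng)  # ground-truth signal direction
    X_trn = rng.standard_normal((n_trn, d)) / np.sqrt(d)
    X_tst = rng.standard_normal((n_tst, d)) / np.sqrt(d)
    y_trn = X_trn @ u + sigma_trn * rng.standard_normal(n_trn)
    y_tst = X_tst @ u + sigma_tst * rng.standard_normal(n_tst)
    # Minimum-norm least squares solution
    w, *_ = np.linalg.lstsq(X_trn, y_trn, rcond=None)
    return np.mean((X_tst @ w - y_tst) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Data scaling regime: fix d, sweep Ntrn (values reduced for speed)
    d, n_tst = 100, 100
    for n_trn in (50, 100, 200, 400):
        risks = [trial_risk(n_trn, n_tst, d, 0.5, 0.5, rng) for _ in range(20)]
        print(f"Ntrn={n_trn}: mean test risk = {np.mean(risks):.4f}")
```

Averaging the test risk over trials for each Ntrn, as the paper does, is what lets the empirical risk curve be compared against the theoretically predicted risk.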