Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression
Authors: Will Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically confirm our theory using simulated experiments. |
| Researcher Affiliation | Academia | William T. Stephenson MIT EMAIL Zachary Frangella Cornell EMAIL Madeleine Udell Cornell EMAIL Tamara Broderick MIT EMAIL |
| Pseudocode | No | The paper describes conceptual steps and refers to optimization algorithms but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | Our first dataset contains N = 2,938 observations of life expectancy, along with D = 20 covariates such as country of origin or alcohol use; see ?? for a full description. ... Our second dataset consists of recorded wine quality of N = 1,599 red wines. The goal is to predict wine quality from D = 11 observed covariates relating to the chemical properties of each wine; see ?? for a full description. |
| Dataset Splits | Yes | Here we study the leave-one-out CV (LOOCV) loss |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models. |
| Software Dependencies | No | The only software dependency for our experiments is NumPy [Harris et al., 2020], which uses the BSD 3-Clause New or Revised License. (This does not include a specific version number for NumPy itself, only the publication year of the paper describing it.) |
| Experiment Setup | Yes | We fix $D = 5$. To generate various spectra of $X$, we set $S_d = e^{\alpha d}/e^{\alpha D}$. For each $\alpha$, we sample 100 left-singular-value matrices $U$ from the uniform distribution... We fix a unit-norm $\theta \in \mathbb{R}^D$ and for each $U$, we generate data from a well-specified linear model, $y_n = \langle x_n, \theta \rangle + \varepsilon_n$, where the $\varepsilon_n$ are drawn i.i.d. from $\mathcal{N}(0, \sigma^2)$ with variance $\sigma^2 = 0.5$. In particular, for each setting of $U$, we generate 100 vectors $Y$. For each setting of $U$ and $Y$, we compute $L$ and check whether it is quasiconvex. |
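
The simulated setup quoted in the Experiment Setup row can be sketched in NumPy roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code. The excerpt does not give the sample size, the right singular vectors, the grid of regularization values, or the method used to test quasiconvexity, so `n = 20`, `V = I`, the `logspace` grid, and the grid-based unimodality check are all hypothetical choices. All function names are illustrative. The loss `L` is taken to be the leave-one-out CV squared error of ridge regression with penalty `lam * ||theta||^2`, computed with the standard hat-matrix identity.

```python
import numpy as np

def random_orthonormal(n, d, rng):
    """Sample an n x d matrix with orthonormal columns (QR of a Gaussian matrix)."""
    q, r = np.linalg.qr(rng.standard_normal((n, d)))
    # Fixing the column signs makes the draw uniform over orthonormal frames.
    return q * np.sign(np.diag(r))

def make_design(n, d, alpha, rng):
    """Build X = U diag(S) with S_d = exp(alpha*d) / exp(alpha*D); V = I is an assumption."""
    s = np.exp(alpha * np.arange(1, d + 1)) / np.exp(alpha * d)
    return random_orthonormal(n, d, rng) * s

def loocv_ridge(X, y, lam):
    """Exact LOOCV mean squared error for ridge (no intercept), lam > 0."""
    d = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    # Leave-one-out residual equals the in-sample residual divided by (1 - H_ii).
    loo_resid = (y - H @ y) / (1.0 - np.diag(H))
    return np.mean(loo_resid ** 2)

def is_quasiconvex_on_grid(values, tol=1e-12):
    """Grid proxy: after the first increase, the sequence never decreases again."""
    diffs = np.diff(values)
    rising = np.flatnonzero(diffs > tol)
    if rising.size == 0:
        return True
    return not np.any(diffs[rising[0]:] < -tol)

def fraction_quasiconvex(alpha, n=20, d=5, n_U=100, n_Y=100, sigma2=0.5, seed=0):
    """Fraction of simulated (U, Y) pairs whose LOOCV curve passes the grid check."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)       # fixed unit-norm parameter vector
    lams = np.logspace(-4, 4, 200)       # hypothetical grid of lambda values
    hits = 0
    for _ in range(n_U):
        X = make_design(n, d, alpha, rng)
        for _ in range(n_Y):
            y = X @ theta + rng.normal(0.0, np.sqrt(sigma2), size=n)
            losses = np.array([loocv_ridge(X, y, lam) for lam in lams])
            hits += is_quasiconvex_on_grid(losses)
    return hits / (n_U * n_Y)
```

A quick trial run might call `fraction_quasiconvex(alpha=0.5, n_U=5, n_Y=5)` before moving to the 100 by 100 repetitions described in the table. A finite grid can miss narrow violations between grid points, so a passing check suggests quasiconvexity but does not prove it.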