Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression
Authors: Will Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically confirm our theory using simulated experiments. |
| Researcher Affiliation | Academia | William T. Stephenson (MIT), Zachary Frangella (Cornell), Madeleine Udell (Cornell), Tamara Broderick (MIT) |
| Pseudocode | No | The paper describes conceptual steps and refers to optimization algorithms but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | Our first dataset contains N = 2,938 observations of life expectancy, along with D = 20 covariates such as country of origin or alcohol use; see ?? for a full description. ... Our second dataset consists of recorded wine quality of N = 1,599 red wines. The goal is to predict wine quality from D = 11 observed covariates relating to the chemical properties of each wine; see ?? for a full description. |
| Dataset Splits | Yes | Here we study the leave-one-out CV (LOOCV) loss |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models. |
| Software Dependencies | No | The only software dependency for our experiments is NumPy [Harris et al., 2020], which uses the BSD 3-Clause New or Revised License. (This does not include a specific version number for NumPy itself, only the publication year of the paper describing it.) |
| Experiment Setup | Yes | We fix $D = 5$. To generate various spectra of $X$, we set $S_d = e^{\alpha d} / e^{\alpha D}$. For each $\alpha$, we sample 100 left-singular-value matrices $U$ from the uniform distribution... We fix a unit-norm $\theta \in \mathbb{R}^D$ and for each $U$, we generate data from a well-specified linear model, $y_n = \langle x_n, \theta \rangle + \varepsilon_n$, where the $\varepsilon_n$ are drawn i.i.d. from $\mathcal{N}(0, \sigma^2)$ with variance $\sigma^2 = 0.5$. In particular, for each setting of $U$, we generate 100 vectors $Y$. For each setting of $U$ and $Y$, we compute $L$ and check whether it is quasiconvex. |
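
The simulated setup quoted in the Experiment Setup row can be sketched in NumPy roughly as follows. This is an illustrative reconstruction, not the authors' code. Several details are assumptions because the excerpt does not state them: the number of observations (`n_obs=20`), taking the right-singular-vector matrix to be the identity, the particular unit-norm θ, the ridge objective `||Xθ - y||² + λ||θ||²`, and the grid of λ values. All function names are hypothetical. The LOOCV loss uses the standard closed form for ridge regression, in which the leave-one-out residual equals the full-fit residual divided by one minus the corresponding hat-matrix diagonal.

```python
import numpy as np


def sample_orthonormal(n_rows, n_cols, rng):
    """Draw an n_rows x n_cols matrix with orthonormal columns (QR of a Gaussian)."""
    q, r = np.linalg.qr(rng.standard_normal((n_rows, n_cols)))
    # Flip column signs so the draw does not depend on QR's sign convention.
    return q * np.sign(np.diag(r))


def spectrum(alpha, dim):
    """Singular values S_d = exp(alpha * d) / exp(alpha * D) for d = 1..D."""
    d = np.arange(1, dim + 1)
    return np.exp(alpha * (d - dim))


def loocv_loss(U, s, y, lam):
    """Closed-form LOOCV loss for ridge with X = U diag(s) (assumed V = I)."""
    shrink = s**2 / (s**2 + lam)          # eigenvalues of the hat matrix
    hat_diag = (U**2) @ shrink            # diagonal of U diag(shrink) U^T
    y_hat = U @ (shrink * (U.T @ y))      # full-data ridge predictions
    loo_resid = (y - y_hat) / (1.0 - hat_diag)
    return float(np.mean(loo_resid**2))


def is_quasiconvex_on_grid(values, tol=1e-12):
    """True if the sequence never decreases again after it first increases."""
    diffs = np.diff(values)
    rising = np.flatnonzero(diffs > tol)
    if rising.size == 0:
        return True
    return not bool(np.any(diffs[rising[0]:] < -tol))


def run_trial(alpha, rng, n_obs=20, dim=5, sigma2=0.5, lams=None):
    """One (U, Y) draw: simulate data, evaluate the loss on a grid, test its shape."""
    if lams is None:
        lams = np.logspace(-3, 3, 200)    # assumed grid of regularization values
    U = sample_orthonormal(n_obs, dim, rng)
    s = spectrum(alpha, dim)
    theta = np.ones(dim) / np.sqrt(dim)   # an arbitrary unit-norm choice
    X = U * s                             # scales column d of U by s[d]
    y = X @ theta + rng.normal(0.0, np.sqrt(sigma2), size=n_obs)
    losses = np.array([loocv_loss(U, s, y, lam) for lam in lams])
    return is_quasiconvex_on_grid(losses)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fraction = np.mean([run_trial(alpha=1.0, rng=rng) for _ in range(100)])
    print(f"fraction of trials quasiconvex on the grid: {fraction:.2f}")
```

A grid-based check of this kind can only reveal violations of quasiconvexity at the chosen resolution; passing it does not prove that the loss is quasiconvex over all λ.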