Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression
Authors: Will Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically confirm our theory using simulated experiments. |
| Researcher Affiliation | Academia | William T. Stephenson (MIT, wtstephe@mit.edu); Zachary Frangella (Cornell, zjf4@cornell.edu); Madeleine Udell (Cornell, udell@cornell.edu); Tamara Broderick (MIT, tbroderick@mit.edu) |
| Pseudocode | No | The paper describes conceptual steps and refers to optimization algorithms but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | Our first dataset contains N = 2,938 observations of life expectancy, along with D = 20 covariates such as country of origin or alcohol use; see ?? for a full description. ... Our second dataset consists of recorded wine quality of N = 1,599 red wines. The goal is to predict wine quality from D = 11 observed covariates relating to the chemical properties of each wine; see ?? for a full description. |
| Dataset Splits | Yes | Here we study the leave-one-out CV (LOOCV) loss |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models. |
| Software Dependencies | No | The only software dependency for our experiments is NumPy [Harris et al., 2020], which uses the BSD 3-Clause New or Revised License. (This does not include a specific version number for NumPy itself, only the publication year of the paper describing it.) |
| Experiment Setup | Yes | We fix $D = 5$. To generate various spectra of $X$, we set $S_d = e^{\alpha d}/e^{\alpha D}$. For each $\alpha$, we sample 100 left-singular-value matrices $U$ from the uniform distribution... We fix a unit-norm $\theta \in \mathbb{R}^D$ and for each $U$, we generate data from a well-specified linear model, $y_n = \langle x_n, \theta \rangle + \varepsilon_n$, where the $\varepsilon_n$ are drawn i.i.d. from $\mathcal{N}(0, \sigma^2)$ with variance $\sigma^2 = 0.5$. In particular, for each setting of $U$ and $Y$, we compute $L$ and check whether it is quasiconvex. In particular, for each setting of $U$, we generate 100 vectors $Y$. |
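
The experiment setup quoted above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: the sample size `N = 20`, the choice of right singular vectors `V = I`, the QR-based sampling of `U`, the particular unit-norm `theta`, the value of `alpha`, the regularization grid `lams`, and the use of mean squared leave-one-out error for `L` are all hypothetical choices that the quoted text does not specify. The function names are likewise illustrative. The quasiconvexity test only checks that the loss never decreases after it has started to increase on a finite grid, so it can miss violations between grid points.

```python
import numpy as np


def sample_design(N, D, alpha, rng):
    """Build X = U diag(S) with S_d = exp(alpha*d) / exp(alpha*D), d = 1..D.

    Assumptions (not from the source): U is drawn as the Q factor of a
    Gaussian matrix with a sign correction, and V is the identity.
    """
    A = rng.standard_normal((N, D))
    Q, R = np.linalg.qr(A)            # reduced QR: Q is N x D with orthonormal columns
    U = Q * np.sign(np.diag(R))       # sign fix so the draw is uniform over such matrices
    d = np.arange(1, D + 1)
    S = np.exp(alpha * d) / np.exp(alpha * D)
    return U * S                      # scales column d of U by S_d


def loocv_ridge_loss(X, y, lam):
    """Mean squared leave-one-out error for ridge regression (no intercept).

    Uses the exact shortcut: the held-out residual equals the in-sample
    residual divided by (1 - H_nn), where H = X (X^T X + lam I)^{-1} X^T.
    """
    D = X.shape[1]
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(D), X.T)
    loo_resid = (y - H @ y) / (1.0 - np.diag(H))
    return np.mean(loo_resid ** 2)


def is_quasiconvex_on_grid(values, tol=1e-10):
    """True if the sequence never decreases after its first increase."""
    diffs = np.diff(values)
    increasing = diffs > tol
    decreasing = diffs < -tol
    if not increasing.any():
        return True                   # monotone non-increasing (or flat)
    first_up = np.argmax(increasing)  # index of the first strict increase
    return not decreasing[first_up:].any()


# One draw of the simulation; every numeric value except D and sigma2 is assumed.
rng = np.random.default_rng(0)
N, D, alpha, sigma2 = 20, 5, 1.0, 0.5
theta = np.ones(D) / np.sqrt(D)       # an arbitrary unit-norm parameter
X = sample_design(N, D, alpha, rng)
y = X @ theta + rng.normal(0.0, np.sqrt(sigma2), size=N)

lams = np.logspace(-4, 4, 200)        # hypothetical grid of regularization strengths
losses = np.array([loocv_ridge_loss(X, y, lam) for lam in lams])
print("quasiconvex on grid:", is_quasiconvex_on_grid(losses))
```

Repeating the last block over many draws of `U` and `Y` for each `alpha`, and recording the fraction of draws that pass the check, would mirror the structure of the described experiment.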