No Double Descent in Principal Component Regression: A High-Dimensional Analysis
Authors: Daniel Gedon, Antonio H. Ribeiro, Thomas B. Schön
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical findings are empirically validated in simulation, demonstrating their practical relevance. |
| Researcher Affiliation | Academia | Department of Information Technology, Uppsala University, Sweden. Correspondence to: Daniel Gedon <daniel.gedon@it.uu.se>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We provide code to reproduce the numerical simulations https://github.com/dgedon/PCR_spiked_covariance. |
| Open Datasets | Yes | For a real-world data example, we use the Diverse MAGIC wheat data set (Scott et al., 2021). |
| Dataset Splits | No | The paper describes varying parameters p and n and their ratio gamma in simulations, but does not provide specific training, validation, and test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For the data-generating process, we choose the parameters α = 0.1, σ_ε = 0.1, r_θ = 1 and ρ_x = 1. We choose d = 10 spikes and vary k to see the effect of model misspecification. For our simulations, we choose n = 500 and set p accordingly to fulfill γ = p/n. We vary γ ∈ [0.1, 30], i.e. from low-dimensional γ < 1 to high-dimensional γ > 1. We compute the risk E_ν[R(θ)] and present median values of the simulation results from 50 realizations as well as 25%, 75% quantiles. |
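
The sweep described in the Experiment Setup row can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code (their repository is linked in the Open Source Code row). The toy data model (identity covariance plus `d` axis-aligned spikes of hypothetical strength `spike`), the function names `pcr_fit`, `simulate_once` and `sweep`, and the excess-risk formula are all assumptions made here. The parameters α and ρ_x are left out because this summary does not say how they enter the model.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression: regress y on the top-k principal directions of X."""
    # Rows of Vt are right singular vectors, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                                   # (p, k) projection basis
    Z = X @ V_k                                      # (n, k) component scores
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)     # least squares in the reduced space
    return V_k @ beta                                # coefficients mapped back to p dimensions

def simulate_once(n, p, k, d=10, spike=10.0, sigma_eps=0.1, rng=None):
    """One realization under an ASSUMED toy spiked model (not the paper's exact process)."""
    rng = np.random.default_rng(rng)
    # Assumption: covariance = identity with d coordinates inflated by `spike`.
    scales = np.ones(p)
    scales[:d] = np.sqrt(1.0 + spike)
    theta = rng.standard_normal(p)
    theta /= np.linalg.norm(theta)                   # parameter vector of norm 1
    X = rng.standard_normal((n, p)) * scales
    y = X @ theta + sigma_eps * rng.standard_normal(n)
    diff = pcr_fit(X, y, k) - theta
    # Excess prediction risk (theta_hat - theta)^T Sigma (theta_hat - theta), Sigma diagonal.
    return float(np.sum((scales * diff) ** 2))

def sweep(gammas, n, k, n_rep=50, seed=0):
    """For each gamma = p/n, return (gamma, p, median, 25% quantile, 75% quantile) of the risk."""
    rows = []
    for g in gammas:
        p = max(int(round(g * n)), 1)
        risks = [simulate_once(n, p, k, rng=seed + r) for r in range(n_rep)]
        q25, med, q75 = np.percentile(risks, [25, 50, 75])
        rows.append((g, p, med, q25, q75))
    return rows

# Small demo so the sketch runs quickly; the setup above uses n = 500 and 50 realizations.
results = sweep([0.5, 2.0], n=40, k=5, n_rep=5)
for g, p, med, q25, q75 in results:
    print(f"gamma={g:4.1f}  p={p:3d}  median risk={med:.3f}  IQR=[{q25:.3f}, {q75:.3f}]")
```

To approximate the reported configuration, one would call `sweep` with `n=500`, `n_rep=50` and a grid of γ values in [0.1, 30]; note that the largest settings give p = 15000, so the SVD step becomes the dominant cost.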