No Double Descent in Principal Component Regression: A High-Dimensional Analysis

Authors: Daniel Gedon, Antonio H. Ribeiro, Thomas B. Schön

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical findings are empirically validated in simulation, demonstrating their practical relevance.
Researcher Affiliation | Academia | Department of Information Technology, Uppsala University, Sweden. Correspondence to: Daniel Gedon <daniel.gedon@it.uu.se>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | We provide code to reproduce the numerical simulations https://github.com/dgedon/PCR_spiked_covariance.
Open Datasets | Yes | For a real-world data example, we use the Diverse MAGIC wheat data set (Scott et al., 2021).
Dataset Splits | No | The paper describes varying parameters p and n and their ratio γ in simulations, but does not provide specific training, validation, and test dataset splits with percentages or counts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | For the data-generating process, we choose the parameters α = 0.1, σ_ε = 0.1, r_θ = 1 and ρ_x = 1. We choose d = 10 spikes and vary k to see the effect of model misspecification. For our simulations, we choose n = 500 and set p accordingly to fulfill γ = p/n. We vary γ ∈ [0.1, 30], i.e. from low-dimensional γ < 1 to high-dimensional γ > 1. We compute the risk E_ν[R(θ)] and present median values of the simulation results from 50 realizations as well as 25%, 75% quantiles.
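
The simulation protocol in the Experiment Setup row (fix n, set p from γ = p/n, repeat over independent realizations, report the median with 25% and 75% quantiles) might be sketched as follows. This is a minimal illustration and not the authors' code: the toy data model in `pcr_risk` (a diagonal covariance with d equal spikes of hypothetical size `spike`, a unit-norm random θ) is an assumption that does not reproduce the paper's roles for α, r_θ and ρ_x, and the function names `pcr_risk` and `sweep` are invented here. The repository linked in the table is the reference implementation.

```python
import numpy as np


def pcr_risk(n, p, k, d=10, sigma_eps=0.1, spike=10.0, rng=None):
    """Excess risk of PCR with k components on a toy spiked model.

    Assumption: the covariance is diagonal with d eigenvalues equal to
    `spike` and the rest equal to 1. This is NOT the paper's exact model.
    """
    rng = np.random.default_rng() if rng is None else rng
    scales = np.ones(p)
    scales[: min(d, p)] = np.sqrt(spike)  # standard deviations per feature
    theta = rng.standard_normal(p)
    theta /= np.linalg.norm(theta)  # unit-norm true parameter
    X = rng.standard_normal((n, p)) * scales
    y = X @ theta + sigma_eps * rng.standard_normal(n)

    # PCR: project onto the top-k right singular vectors, then least squares.
    k = min(k, n, p)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T
    beta = np.linalg.lstsq(X @ V_k, y, rcond=None)[0]
    theta_hat = V_k @ beta

    # Excess risk (theta_hat - theta)^T Sigma (theta_hat - theta), Sigma diagonal.
    diff = theta_hat - theta
    return float(np.sum(scales**2 * diff**2))


def sweep(gammas, n=500, k=10, n_rep=50, seed=0):
    """For each gamma, set p = round(gamma * n) and summarize n_rep runs."""
    rng = np.random.default_rng(seed)
    rows = []
    for gamma in gammas:
        p = max(1, int(round(gamma * n)))
        risks = [pcr_risk(n, p, k, rng=rng) for _ in range(n_rep)]
        q25, med, q75 = np.quantile(risks, [0.25, 0.5, 0.75])
        rows.append((float(gamma), p, float(q25), float(med), float(q75)))
    return rows


if __name__ == "__main__":
    # Small settings so the demo runs quickly; the table reports n = 500,
    # 50 realizations, and gamma in [0.1, 30].
    for row in sweep(np.geomspace(0.1, 30, 6), n=100, k=10, n_rep=10):
        print("gamma=%.2f p=%d q25=%.3f median=%.3f q75=%.3f" % row)
```

The logarithmic spacing of the γ grid in the demo is also an assumption; the table states only the range. Varying `k` below and above d = 10 corresponds to the misspecification study the row mentions.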