No Double Descent in Principal Component Regression: A High-Dimensional Analysis

Authors: Daniel Gedon, Antonio H. Ribeiro, Thomas B. Schön

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our theoretical findings are empirically validated in simulation, demonstrating their practical relevance.
Researcher Affiliation	Academia	Department of Information Technology, Uppsala University, Sweden. Correspondence to: Daniel Gedon <daniel.gedon@it.uu.se>.
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	Yes	We provide code to reproduce the numerical simulations at https://github.com/dgedon/PCR_spiked_covariance.
Open Datasets	Yes	For a real-world data example, we use the Diverse MAGIC wheat data set (Scott et al., 2021).
Dataset Splits	No	The paper describes varying parameters p and n and their ratio gamma in simulations, but does not provide specific training, validation, and test dataset splits with percentages or counts.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	For the data-generating process, we choose the parameters α = 0.1, σ_ε = 0.1, r_θ = 1 and ρ_x = 1. We choose d = 10 spikes and vary k to see the effect of model misspecification. For our simulations, we choose n = 500 and set p accordingly to fulfill γ = p/n. We vary γ ∈ [0.1, 30], i.e. from low-dimensional γ < 1 to high-dimensional γ > 1. We compute the risk E_ν[R(θ)] and present median values of the simulation results from 50 realizations as well as the 25% and 75% quantiles.
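
The simulation protocol in the Experiment Setup row (fix n, set p from γ = p/n, repeat over realizations, report the median with 25% and 75% quantiles) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data-generating process is a hypothetical stand-in (identity covariance plus d axis-aligned spikes of assumed strength `spike`, a unit-norm random θ), the roles of α, r_θ and ρ_x from the paper are not modeled, and the names `pcr_fit` and `simulate_risk` are invented for this example. The authors' implementation is in the linked repository.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Principal component regression on the top-k principal directions of X.

    Assumption: X is used uncentered, so the directions come from the SVD of X.
    """
    if not 1 <= k <= min(X.shape):
        raise ValueError("k must satisfy 1 <= k <= min(n, p)")
    # Rows of Vt are the right singular vectors, ordered by singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                        # (p, k) projection basis
    Z = X @ V_k                           # (n, k) component scores
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return V_k @ beta                     # estimate mapped back to p dimensions


def simulate_risk(n, gamma, k, d=10, sigma_eps=0.1, spike=5.0, n_rep=50, seed=0):
    """Return the 25%, 50% and 75% quantiles of the excess risk over n_rep draws.

    Hypothetical setup: Sigma = I + spike * (first d coordinate directions).
    """
    rng = np.random.default_rng(seed)
    p = int(round(gamma * n))             # gamma = p / n
    scales = np.ones(p)
    scales[: min(d, p)] = np.sqrt(1.0 + spike)   # per-coordinate std deviations
    risks = []
    for _ in range(n_rep):
        theta = rng.standard_normal(p)
        theta /= np.linalg.norm(theta)    # unit-norm parameter (assumption)
        X = rng.standard_normal((n, p)) * scales
        y = X @ theta + sigma_eps * rng.standard_normal(n)
        diff = pcr_fit(X, y, k) - theta
        # Excess risk diff^T Sigma diff for a fresh x ~ N(0, Sigma), Sigma diagonal.
        risks.append(float(np.sum((scales * diff) ** 2)))
    return np.quantile(risks, [0.25, 0.5, 0.75])
```

A sweep over γ would call `simulate_risk(n=500, gamma=g, k=k)` for each `g` in a grid such as `np.geomspace(0.1, 30, 20)` and plot the median against γ with the quantiles as a band. The grid and the choice of k are left to the reader, and k must not exceed min(n, p).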