reproducibilityindex.ai

Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum

Authors: Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We run several simple experiments to validate our theoretical analysis on overfitting. Figure 3 confirms that the condition number of the kernel matrix grows as described in Theorem 4.1: with $s_{\max}/s_{\min} \asymp \lambda_1/\lambda_N$ in the case of a polynomial spectrum and $s_{\max}/s_{\min} = \Theta(N \lambda_1/\lambda_N)$ in the case of an exponential spectrum.
Researcher Affiliation	Academia	(1) Department of Mathematics and Computer Science, University of Basel, Switzerland; (2) Department of Mathematics and Statistics, McMaster University and Vector Institute, Canada; (3) Faculty of Mathematics and Computer Science, UniDistance Suisse, Switzerland.
Pseudocode	No	The paper contains no explicit section or figure labeled "Pseudocode" or "Algorithm".
Open Source Code	No	The paper does not include a direct link to a source-code repository for their methodology, nor an explicit statement that their code is being released. It only mentions "his help with the code on NTK." in the acknowledgements.
Open Datasets	No	"For simplicity, we implement the experiment following Assumption 3.4. Let phi_k ~ N(0, Lambda) be i.i.d. Gaussian random vectors with covariance Lambda = diag{lambda_k}." and "In this scenario, the data follows a Gaussian distribution on the real line." The paper describes using synthetic data rather than a public dataset.
Dataset Splits	No	The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. It primarily uses simulated data.
Hardware Specification	No	The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup	Yes	For each pair N and M = 10N, we run over 20 random samplings for the kernel matrix formed from Phi. Likewise, for each pair N and M = 10N, we run over 20 iterations for the same true coefficient.
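
The condition-number experiment summarized in the table (Gaussian features with covariance Lambda = diag{lambda_k}, M = 10N, 20 random samplings per N) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code, which the paper does not release. The function names, the decay rates `a`, the values of N, the (M, N) layout of Phi with K = Phi^T Phi, and the use of the median over trials are all hypothetical choices made here for illustration.

```python
import numpy as np


def spectrum(kind, M, a):
    """Return eigenvalues lambda_1 >= ... >= lambda_M.

    The exact decay forms and the rate `a` are assumptions for this sketch.
    """
    k = np.arange(1, M + 1, dtype=float)
    if kind == "polynomial":
        return k ** (-a)        # lambda_k = k^(-a)
    if kind == "exponential":
        return np.exp(-a * k)   # lambda_k = exp(-a k)
    raise ValueError(f"unknown spectrum kind: {kind!r}")


def kernel_condition_number(lam, N, rng):
    """Sample Phi and return s_max / s_min of K = Phi^T Phi.

    Assumed layout: Phi has shape (M, N), with i.i.d. columns drawn
    from N(0, diag(lam)). Any scalar normalization of K (e.g. 1/N)
    cancels in the ratio, so it is omitted.
    """
    M = lam.shape[0]
    Phi = np.sqrt(lam)[:, None] * rng.standard_normal((M, N))
    # The eigenvalues of Phi^T Phi are the squared singular values of Phi;
    # working with the SVD avoids forming K explicitly.
    sv = np.linalg.svd(Phi, compute_uv=False)  # sorted in descending order
    return (sv[0] / sv[-1]) ** 2


def run(kind, Ns=(10, 20, 40), trials=20, a=2.0, seed=0):
    """For each N, use M = 10N and summarize `trials` random samplings."""
    rng = np.random.default_rng(seed)
    rows = []
    for N in Ns:
        lam = spectrum(kind, 10 * N, a)
        conds = [kernel_condition_number(lam, N, rng) for _ in range(trials)]
        # Report the median condition number next to lambda_1 / lambda_N.
        rows.append((N, float(np.median(conds)), float(lam[0] / lam[N - 1])))
    return rows


if __name__ == "__main__":
    for kind, a in (("polynomial", 2.0), ("exponential", 0.2)):
        for N, cond, ratio in run(kind, a=a):
            print(f"{kind:12s} N={N:3d} cond={cond:.3e} "
                  f"lambda_1/lambda_N={ratio:.3e}")
```

Plotting the reported condition number against lambda_1/lambda_N for growing N is one way to compare the two spectra with the rates quoted from Theorem 4.1. A small exponential rate is used here only to keep the matrices within double-precision range.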