Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum

Authors: Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius

ICML 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We run several simple experiments to validate our theoretical analysis on overfitting. Figure 3 confirms that the condition number of the kernel matrix grows as described in Theorem 4.1: with s_max / s_min ≈ lambda_1 / lambda_N in the case of a polynomial spectrum and s_max / s_min = Theta_N(lambda_1 / lambda_N) in the case of an exponential spectrum.
Researcher Affiliation Academia (1) Department of Mathematics and Computer Science, University of Basel, Switzerland; (2) Department of Mathematics and Statistics, McMaster University and Vector Institute, Canada; (3) Faculty of Mathematics and Computer Science, UniDistance Suisse, Switzerland.
Pseudocode No The paper contains no explicit section or figure labeled "Pseudocode" or "Algorithm".
Open Source Code No The paper does not include a direct link to a source-code repository for their methodology, nor an explicit statement that their code is being released. It only mentions "his help with the code on NTK." in the acknowledgements.
Open Datasets No "For simplicity, we implement the experiment following Assumption 3.4. Let phi_k ~ N(0, Lambda) be i.i.d. Gaussian random vector with covariance Lambda = diag{lambda_k}." and "In this scenario, the data follows a Gaussian distribution on the real line." The paper describes using synthetic data rather than a public dataset.
Dataset Splits No The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. It primarily uses simulated data.
Hardware Specification No The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies No The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup Yes "For each pair N and M = 10N, we run over 20 random samplings for the kernel matrix Phi Phi." and "For each pair N and M = 10N, we run over 20 iterations for the same true coefficient."
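Since no code is released, the synthetic setup quoted above (Gaussian features with covariance Lambda = diag{lambda_k}, M = 10N, 20 random samplings) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names, the decay parameter a, the specific spectra lambda_k = k^(-1-a) and lambda_k = exp(-a k), the row-wise layout of Phi, and the use of the median over trials are all assumptions introduced here.

```python
import numpy as np


def spectrum(kind, M, a=1.0):
    """Return M eigenvalues lambda_1 > ... > lambda_M (hypothetical parameterization)."""
    k = np.arange(1, M + 1, dtype=float)
    if kind == "polynomial":
        return k ** (-(1.0 + a))   # assumed form: lambda_k = k^(-1-a)
    if kind == "exponential":
        return np.exp(-a * k)      # assumed form: lambda_k = exp(-a k)
    raise ValueError(f"unknown spectrum kind: {kind!r}")


def condition_number(N, lam, rng):
    """Condition number of the N x N Gram matrix Phi @ Phi.T for one random draw.

    Assumption: each of the N rows of Phi is an independent N(0, diag(lam)) vector.
    """
    M = lam.shape[0]
    Phi = rng.standard_normal((N, M)) * np.sqrt(lam)
    # Singular values of Phi, sorted in descending order. Their squares are the
    # eigenvalues of Phi @ Phi.T, so the squared ratio is its condition number.
    s = np.linalg.svd(Phi, compute_uv=False)
    return (s[0] / s[-1]) ** 2


def median_condition_number(kind, N, a=1.0, trials=20, seed=0):
    """Median condition number over `trials` random samplings, with M = 10N."""
    rng = np.random.default_rng(seed)
    lam = spectrum(kind, 10 * N, a=a)
    return float(np.median([condition_number(N, lam, rng) for _ in range(trials)]))


if __name__ == "__main__":
    for kind, a in (("polynomial", 1.0), ("exponential", 0.2)):
        for N in (4, 8, 16, 32):
            lam = spectrum(kind, 10 * N, a=a)
            cond = median_condition_number(kind, N, a=a)
            # Compare the empirical value with the ratio lambda_1 / lambda_N
            # that appears in the statement quoted from Theorem 4.1.
            print(kind, N, cond, lam[0] / lam[N - 1])
```

Taking the singular values of Phi rather than the eigenvalues of the Gram matrix avoids squaring the condition number before it is measured, which matters numerically when the spectrum decays exponentially.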