Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
Authors: Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run several simple experiments to validate our theoretical analysis on overfitting. Figure 3 confirms that the condition number of the kernel matrix grows as described in Theorem 4.1: with $\frac{s_{\max}}{s_{\min}} \asymp \frac{\lambda_1}{\lambda_N}$ in the case of a polynomial spectrum and $\frac{s_{\max}}{s_{\min}} = \Theta_N\left(\frac{\lambda_1}{\lambda_N}\right)$ in the case of an exponential spectrum. |
| Researcher Affiliation | Academia | ¹Department of Mathematics and Computer Science, University of Basel, Switzerland; ²Department of Mathematics and Statistics, McMaster University and Vector Institute, Canada; ³Faculty of Mathematics and Computer Science, UniDistance Suisse, Switzerland. |
| Pseudocode | No | The paper contains no explicit section or figure labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | The paper does not include a direct link to a source-code repository for their methodology, nor an explicit statement that their code is being released. It only mentions "his help with the code on NTK." in the acknowledgements. |
| Open Datasets | No | "For simplicity, we implement the experiment following Assumption 3.4. Let $\phi_k \sim \mathcal{N}(0, \Lambda)$ be i.i.d. Gaussian random vectors with covariance $\Lambda = \operatorname{diag}\{\lambda_k\}$." and "In this scenario, the data follows a Gaussian distribution on the real line." The paper describes using synthetic data rather than a public dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. It primarily uses simulated data. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments. |
| Experiment Setup | Yes | "For each pair $N$ and $M = 10N$, we run over 20 random samplings for the kernel matrix $\Phi^\top \Phi$." and "For each pair $N$ and $M = 10N$, we run over 20 iterations for the same true coefficient." |
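
Since the paper releases no code, the setup quoted in the table can only be approximated. The following is a minimal sketch, not the authors' implementation: it draws $N$ Gaussian feature vectors of dimension $M = 10N$ with covariance $\operatorname{diag}\{\lambda_k\}$, forms the kernel matrix 20 times, and records its condition number next to the ratio $\lambda_1 / \lambda_N$. The function names, the choice $\lambda_k = k^{-a}$ with $a = 2$, the seed, and the orientation of $\Phi$ (features in rows, samples in columns) are all assumptions made for illustration.

```python
import numpy as np


def polynomial_spectrum(M, a=2.0):
    """Hypothetical polynomially decaying eigenvalues: lambda_k = k^(-a)."""
    k = np.arange(1, M + 1, dtype=float)
    return k ** (-a)


def condition_numbers(N, spectrum_fn, trials=20, seed=0):
    """Sample the N x N kernel matrix `trials` times.

    Returns the array of condition numbers s_max / s_min and the
    reference ratio lambda_1 / lambda_N for comparison.
    """
    rng = np.random.default_rng(seed)
    M = 10 * N                      # feature dimension, as in the quoted setup
    lam = spectrum_fn(M)            # eigenvalues lambda_1 >= ... >= lambda_M
    conds = []
    for _ in range(trials):
        # Each column is one sample phi ~ N(0, diag(lam)) (assumed orientation).
        Phi = np.sqrt(lam)[:, None] * rng.standard_normal((M, N))
        K = Phi.T @ Phi             # N x N kernel (Gram) matrix
        conds.append(np.linalg.cond(K))  # 2-norm condition number
    return np.array(conds), lam[0] / lam[N - 1]
```

Calling `condition_numbers(N, polynomial_spectrum)` for a range of `N` and plotting the mean condition number against the returned ratio would give a rough analogue of the comparison the table attributes to Figure 3. An exponential spectrum could be tried by passing a different `spectrum_fn`, though very small eigenvalues may make the computed condition number numerically unreliable.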