Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression

Authors: Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This paper conducts a comprehensive study of the learning curves of kernel ridge regression (KRR) under minimal assumptions. In Figure 3, our experiment demonstrates the GEP, as the learning curves with kernel features (Sine features) and independent features (Gaussian features z ~ N(0, I_p) or Rademacher features z ~ (unif{±1})^p) coincide and match the theoretical decay.
Researcher Affiliation | Academia | Tin Sum Cheng, Aurelien Lucchi, Department of Mathematics and Computer Science, University of Basel, Switzerland (EMAIL, EMAIL); Anastasis Kratsios, Department of Mathematics, McMaster University and The Vector Institute, Ontario, Canada (EMAIL); David Belius, Faculty of Mathematics and Computer Science, UniDistance Suisse, Switzerland (EMAIL)
Pseudocode | No | The paper provides a 'Proof sketch' in Section 4 and a 'flowchart in Figure 4' outlining proof techniques, but it does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The code for the experiments is uploaded as supplementary materials.
Open Datasets | No | The experiments are based on synthetically generated data using specified parameters (e.g., 'µ = unif[0, 1]'), not a publicly accessible dataset with explicit access information (link, DOI, formal citation).
Dataset Splits | No | The paper mentions a 'sample size n ranges from 100 to 1000' and discusses 'test error', but it does not specify explicit train/validation/test dataset splits (percentages, counts, or predefined splits) for its synthetic data.
Hardware Specification | Yes | All experiments were conducted on a computer with a 2.3 GHz Quad-Core Intel Core i7 processor.
Software Dependencies | No | The paper does not provide specific software dependencies or library versions (e.g., 'PyTorch 1.9' or 'NumPy 1.20') used for the experiments.
Experiment Setup | Yes | In the following experiment, we choose p = 2000, and the sample size n ranges from 100 to 1000, with ridge parameter λ = (2n^{1/2}π)^{-b} where b ∈ [0, 1 + a].
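The experiment described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' released code: it uses a linear kernel on independent Gaussian features (one of the feature families in the table), a fixed ridge parameter, a simple noisy linear target, and hypothetical values for p, the test-set size, and the noise level.

```python
import numpy as np

rng = np.random.default_rng(0)

def krr_test_error(n, p=200, lam=1e-3, n_test=500):
    """Fit kernel ridge regression on synthetic Gaussian features
    z ~ N(0, I_p) and return the mean squared test error."""
    # Training and test features; a noisy linear target (assumed for illustration)
    X = rng.standard_normal((n, p))
    X_test = rng.standard_normal((n_test, p))
    w = rng.standard_normal(p) / np.sqrt(p)
    y = X @ w + 0.1 * rng.standard_normal(n)
    y_test = X_test @ w

    # Linear kernel K = X X^T / p; KRR dual coefficients (K + lam I)^{-1} y
    K = X @ X.T / p
    alpha = np.linalg.solve(K + lam * np.eye(n), y)

    # Predict via the kernel between test and training points
    y_pred = (X_test @ X.T / p) @ alpha
    return float(np.mean((y_test - y_pred) ** 2))

# Learning curve: test error should shrink as the sample size n grows
errs = [krr_test_error(n) for n in (100, 400, 1000)]
```

Sweeping n as above traces the empirical learning curve; the paper additionally varies the ridge decay exponent b and compares kernel features against the independent Gaussian and Rademacher features.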