Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Computational Efficiency under Covariate Shift in Kernel Ridge Regression

Authors: Andrea Della Vecchia, Arnaud Mavakala Watusadisi, Ernesto De Vito, Lorenzo Rosasco

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we validate these theoretical findings through simulations and real data experiments. (from Contribution section); 7 Simulations and real data experiments (Section title)
Researcher Affiliation | Academia | Andrea Della Vecchia: Swiss Finance Institute (SFI), EPFL; Arnaud Mavakala Watusadisi: MaLGa, DIBRIS, University of Genova; Ernesto De Vito: MaLGa, DIMA, University of Genova; Lorenzo Rosasco: MaLGa, DIBRIS, University of Genova, Istituto Italiano di Tecnologia
Pseudocode | No | The paper describes methods and derivations but does not include explicit pseudocode or algorithm blocks.
Open Source Code | No | Answer: [No] Justification: datasets are open access, see Appendix D. The implemented algorithms are classic models implementable with standard tools such as sklearn.
Open Datasets | Yes | The HHAR dataset (Stisen et al., 2015), accessible on DR-NTU (Data), dataset DOI: https://doi.org/10.21979/N9/OWDFXO, version used: 3.0 (May 27, 2022), license: CC BY-NC 4.0, authors of the dataset version: Mohamed Ragab and Emadeldeen Eldele. (Similar entries for other datasets in Appendix D.)
Dataset Splits | Yes | User A has 1,218,871 samples designated as training data, from which 15,500 samples were randomly selected, while user H provides the test data. (HHAR dataset section in Appendix D; similar details for other datasets.) We optimize the hyperparameters using hold-out cross-validation, partitioning the dataset into 70% for training and 30% for validation.
Hardware Specification | Yes | The experiments were conducted in Python on a 2018 MacBook Pro with a 2.3 GHz Intel Core i5 Quad-Core processor, 16GB of RAM, and no GPU.
Software Dependencies | No | The experiments were conducted in Python on a 2018 MacBook Pro with a 2.3 GHz Intel Core i5 Quad-Core processor, 16GB of RAM, and no GPU. (From the Checklist Q5 justification: The implemented algorithms are classic models implementable with standard tools such as sklearn.)
Experiment Setup | Yes | We used $\lambda_1 = 10^{-4}$, $\lambda_Q = 1$ and $Q = 10$ to generate the geometric grid. We also propose a method for generating a geometric sequence of $\gamma$ values, derived from distinct formulas for even and odd indices. Remark 2. The following values represent the parameter $\gamma$ for the Gaussian kernel: $\gamma = \begin{cases} 10^{-3 + (k-1)/2}, & k \text{ odd}, \\ 5 \cdot 10^{-3 + (k-2)/2}, & k \text{ even}, \end{cases} \quad k = 1, 2, \dots, 6$. We optimize the hyperparameters using hold-out cross-validation, partitioning the dataset into 70% for training and 30% for validation. For each combination of hyperparameters $\lambda$ and $\gamma$, we train the model on $X_{\text{train}}$ and $y_{\text{train}}$ and validate it on $X_{\text{val}}$ and $y_{\text{val}}$.
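
The hyperparameter search quoted in the Experiment Setup row could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the authors' code, which is not released. The assumptions are: the exponents in the extracted text are read as negative (λ from 10^-4 to 1, γ in {0.001, 0.005, 0.01, 0.05, 0.1, 0.5}); scikit-learn's `KernelRidge` with `kernel="rbf"` stands in for the paper's Gaussian-kernel ridge regression; and `alpha` and `gamma` are taken to correspond directly to λ and γ, although the paper's normalization may differ. The function names `lambda_grid`, `gamma_grid`, and `holdout_select` are hypothetical.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split


def lambda_grid(lam_first=1e-4, lam_last=1.0, q=10):
    """Geometric grid of q regularization values from lam_first to lam_last."""
    return np.geomspace(lam_first, lam_last, num=q)


def gamma_grid(k_max=6):
    """Gamma values for k = 1..k_max, using separate odd and even formulas.

    Assumes the exponents are -3 + (k - 1)/2 (k odd) and -3 + (k - 2)/2 (k even).
    """
    values = []
    for k in range(1, k_max + 1):
        if k % 2 == 1:
            values.append(10.0 ** (-3 + (k - 1) / 2))
        else:
            values.append(5.0 * 10.0 ** (-3 + (k - 2) / 2))
    return np.array(values)


def holdout_select(X, y, lambdas, gammas, seed=0):
    """Pick (lambda, gamma) by a single 70/30 hold-out split.

    Returns (validation MSE, lambda, gamma) for the best pair.
    """
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    best = None
    for lam in lambdas:
        for gam in gammas:
            # Assumption: sklearn's alpha and gamma play the roles of lambda and gamma.
            model = KernelRidge(alpha=lam, kernel="rbf", gamma=gam)
            model.fit(X_train, y_train)
            mse = float(np.mean((model.predict(X_val) - y_val) ** 2))
            if best is None or mse < best[0]:
                best = (mse, float(lam), float(gam))
    return best


if __name__ == "__main__":
    # Synthetic data only, to show the call pattern.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
    print(holdout_select(X, y, lambda_grid(), gamma_grid()))
```

The search covers 10 × 6 = 60 (λ, γ) pairs, each requiring one kernel ridge fit on the 70% training portion, which is consistent with the CPU-only laptop setup listed in the Hardware Specification row.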