Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization

Authors: Antonio Ribeiro, David Vävinggren, Dave Zachariah, Thomas B Schön, Francis Bach

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical evaluation shows good performance in both clean and adversarial settings. We validate the method on real and simulated data. We make our implementation available at github.com/antonior92/adversarial training kernel. Numerical experiments
Researcher Affiliation	Academia	Antônio H. Ribeiro Uppsala University EMAIL David Vävinggren Uppsala University EMAIL Dave Zachariah Uppsala University EMAIL Thomas B. Schön Uppsala University EMAIL Francis Bach PSL Research University / INRIA EMAIL
Pseudocode	Yes	Algorithm 1: Iterative Kernel Ridge Regression. Initialize: weights $w_i \leftarrow 1$, $i = 1, \dots, n$; and $\lambda \leftarrow \delta$. Repeat: (1) Solve reweighted kernel ridge regression: $\hat{f} \leftarrow \arg\min_{f} \frac{1}{n} \sum_{i=1}^{n} w_i (y_i - f(x_i))^2 + \lambda \|f\|_{\mathcal{H}}^2$. (2) Update weights (using Eq. (9)): $w, \lambda \leftarrow \text{UpdateWeights}(\hat{f})$. (3) Quit if StopCriteria.
Open Source Code	Yes	Our contributions are to propose and analyze feature-perturbed adversarial kernel training. We: ... We make our implementation available at github.com/antonior92/adversarial training kernel.
Open Datasets	Yes	We used 5 different data sets in our experiments. Diabetes: (Efron et al., 2004) The dataset has p=10 baseline variables (age, sex, body mass index, average blood pressure, and six blood serum measurements), which were obtained for n=442 diabetes patients. Abalone: (OpenML ID=30, UCI ID=1) Predicting the age of abalone from p=8 physical measurements. Wine quality: (Cortez et al., 2009, UCI ID=186) A large dataset (n=4898) with white and red vinho verde samples (from Portugal) used to predict human wine taste preferences. Pollution: (McDonald & Schwing, 1973, OpenML ID=542) Estimates relating air pollution to mortality. US crime: (Redmond & Baveja, 2002, OpenML ID=42730, UCI ID=182) This dataset combines p=127 features that come from socio-economic data from the US Census, law enforcement data from the LEMAS survey, and crime data from the FBI for n=1994 communities.
Dataset Splits	No	The paper mentions 'test set R2 scores' which implies data splits, but does not provide specific details such as percentages, sample counts, or explicit methodology for how these splits were performed for reproducibility. For example, there's no mention of 80/20 train/test splits or specific file names for predefined splits.
Hardware Specification	No	This is not a very computationally intensive paper. All the experiments can run on a personal computer, hence we don't put a lot of emphasis on this aspect.
Software Dependencies	No	The paper does not explicitly state any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experiments.
Experiment Setup	Yes	For kernel ridge regression, hyperparameters are selected via cross-validation over $\gamma \in \{10, 1, 0.1, 10^{-2}, 10^{-3}\}$ and $\lambda \in \{1, 0.1, 10^{-2}, 10^{-3}\}$. For adversarial kernel training, we use a default adversarial radius and select $\gamma$ from the same range using cross-validation. For comparison, we also include an input-space adversarial training baseline with $\delta_{\text{train}} = 0.05$, following Equation (1), trained for 300 epochs using Adam.
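
Illustrative sketch (not taken from the paper's repository): the loop quoted in the Pseudocode row can be outlined in Python with NumPy as below. Several things here are assumptions: the RBF kernel, the 1/n scaling of the data-fit term, the stopping rule, and all function names (`rbf_kernel`, `solve_weighted_krr`, `iterative_krr`) are hypothetical. The weight update of Eq. (9) is not reproduced on this page, so it is passed in as a user-supplied callback rather than implemented.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Assumed kernel: k(a, b) = exp(-gamma * ||a - b||^2).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def solve_weighted_krr(K, y, w, lam):
    # Step 1: minimise (1/n) sum_i w_i (y_i - f(x_i))^2 + lam * ||f||_H^2
    # with f = sum_j alpha_j k(., x_j). Setting the gradient to zero gives
    # (diag(w) K + n * lam * I) alpha = diag(w) y.
    n = len(y)
    return np.linalg.solve(w[:, None] * K + n * lam * np.eye(n), w * y)

def iterative_krr(K, y, update_weights, lam0, max_iter=100, tol=1e-8):
    # update_weights is a placeholder for Eq. (9), which is not shown here.
    n = len(y)
    w, lam = np.ones(n), lam0          # initialise w_i = 1 and lambda
    alpha = np.zeros(n)
    for _ in range(max_iter):
        new_alpha = solve_weighted_krr(K, y, w, lam)              # step 1
        w, lam = update_weights(K, y, new_alpha, w, lam)          # step 2
        change = np.linalg.norm(new_alpha - alpha)
        alpha = new_alpha
        if change <= tol * (1.0 + np.linalg.norm(alpha)):         # step 3
            break
    return alpha, w, lam

# Usage with a trivial callback that leaves the weights unchanged, so the
# loop reduces to ordinary kernel ridge regression on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
K = rbf_kernel(X, X, gamma=0.1)
keep_weights = lambda K, y, alpha, w, lam: (w, lam)
alpha, w, lam = iterative_krr(K, y, keep_weights, lam0=0.01)
```

Passing the update as a callback keeps the linear solve, which is fully determined by the quoted objective, separate from the part that depends on the paper's Eq. (9).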