Generalization Through the Lens of Leave-One-Out Error

Authors: Gregor Bachmann, Thomas Hofmann, Aurélien Lucchi

ICLR 2022

Reproducibility

| Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We show both theoretically and empirically that the leave-one-out error is capable of capturing various phenomena in generalization theory, such as double descent, random labels or transfer learning." |
| Researcher Affiliation | Academia | Gregor Bachmann (a), Thomas Hofmann (a), and Aurélien Lucchi (b); (a) Department of Computer Science, ETH Zürich, Switzerland; (b) Department of Mathematics and Computer Science, University of Basel. |
| Pseudocode | No | The paper contains no sections labeled "Pseudocode" or "Algorithm", nor any structured algorithmic steps presented in a code-like format. |
| Open Source Code | Yes | "We release the code for our numerical experiments on GitHub." (https://github.com/gregorbachmann/Leave_One_Out) |
| Open Datasets | Yes | The models are evaluated on the benchmark vision datasets MNIST (LeCun & Cortes, 2010) and CIFAR10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper varies the training sample size n, evaluates on test sets, and discusses K-fold validation as a general technique, but it does not give specific train/validation/test split percentages or sample counts, nor an explicit validation strategy for its own experiments beyond evaluation on a test set. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types) used to run the experiments. |
| Software Dependencies | No | All experiments are performed in JAX (Bradbury et al., 2018) using the neural-tangents library (Novak et al., 2020). |
| Experiment Setup | No | The paper describes the models used (e.g., a 5-layer fully-connected NTK model, a 1-layer random feature model, ResNet18, AlexNet, VGG, DenseNet) and the general task (fine-tuning the top layer), but it does not provide specific hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings. |
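For reference, the leave-one-out error the paper studies is the average test error of n models, where the i-th model is trained on all samples except the i-th and evaluated on that held-out point. The brute-force sketch below illustrates the definition only; the nearest-centroid classifier and synthetic Gaussian data are stand-ins chosen here, not the NTK or random-feature models from the paper.

```python
import numpy as np

def leave_one_out_error(X, y, fit, predict):
    """Brute-force leave-one-out error: train n models, each with one
    sample held out, and average the mistakes on the held-out points."""
    n = len(y)
    mistakes = 0
    for i in range(n):
        mask = np.arange(n) != i          # drop sample i from training
        model = fit(X[mask], y[mask])
        mistakes += int(predict(model, X[i]) != y[i])
    return mistakes / n

# Illustrative classifier: predict the class of the nearest centroid.
def fit_centroids(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_centroid(model, x):
    classes = list(model)
    dists = [np.linalg.norm(x - model[c]) for c in classes]
    return classes[int(np.argmin(dists))]

# Two well-separated synthetic clusters, so the LOO error should be small.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
loo = leave_one_out_error(X, y, fit_centroids, predict_centroid)
print(f"leave-one-out error: {loo:.3f}")
```

Note that this naive loop refits the model n times; the paper's appeal of the leave-one-out error for kernel/NTK models is that closed-form shortcuts avoid exactly this cost.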