Generalization Through the Lens of Leave-One-Out Error
Authors: Gregor Bachmann, Thomas Hofmann, Aurélien Lucchi
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show both theoretically and empirically that the leave-one-out error is capable of capturing various phenomena in generalization theory, such as double descent, random labels or transfer learning. |
| Researcher Affiliation | Academia | Gregor Bachmann (a), Thomas Hofmann (a), and Aurélien Lucchi (b); (a) Department of Computer Science, ETH Zürich, Switzerland; (b) Department of Mathematics and Computer Science, University of Basel |
| Pseudocode | No | The paper does not contain any sections explicitly labeled as "Pseudocode" or "Algorithm", nor does it present any structured algorithmic steps in a code-like format. |
| Open Source Code | Yes | We release the code for our numerical experiments on GitHub: https://github.com/gregorbachmann/Leave_One_Out |
| Open Datasets | Yes | We evaluate the models on the benchmark vision datasets MNIST (LeCun & Cortes, 2010) and CIFAR10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper varies the training sample size n, evaluates on test sets, and discusses K-fold validation as a general technique. However, it does not provide specific train/validation/test split percentages or sample counts, nor an explicit validation strategy for its own experiments beyond test-set evaluation. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | Although the paper states that "All experiments are performed in Jax (Bradbury et al., 2018) using the neural-tangents library (Novak et al., 2020)", it does not list specific version numbers for these software dependencies. |
| Experiment Setup | No | The paper describes the models used (e.g., "5-layer fully-connected NTK model", "1-layer random feature model", "ResNet18", "AlexNet", "VGG", "DenseNet") and the general task (fine-tuning the top layer). However, it does not provide specific hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings for these experiments. |
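
The paper's central quantity, referenced in the Research Type and Experiment Setup rows above, is the leave-one-out (LOO) error of kernel models such as the infinite-width NTK. The sketch below is illustrative only and is not taken from the authors' released code: it builds a fully-connected NTK kernel with the neural-tangents library mentioned under Software Dependencies and evaluates the LOO error via the standard closed-form identity for kernel ridge regression. The depth, width, ridge strength, and the synthetic stand-in data are all assumptions made for this example.

```python
# Minimal, illustrative sketch (assumed setup, not the authors' released code):
# build a fully-connected NTK kernel with neural-tangents, then compute the
# closed-form leave-one-out error of kernel ridge regression.
import jax.numpy as jnp
from jax import random
from neural_tangents import stax

# A 5-layer fully-connected architecture; kernel_fn gives its infinite-width NTK.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

# Toy data standing in for flattened MNIST/CIFAR10 inputs (assumed sizes).
n, d = 200, 784
x = random.normal(random.PRNGKey(0), (n, d))
y = random.normal(random.PRNGKey(1), (n, 1))

ridge = 1e-3                                    # assumed regularization strength
K = kernel_fn(x, x, 'ntk')                      # n x n NTK Gram matrix
H = K @ jnp.linalg.inv(K + ridge * jnp.eye(n))  # smoother ("hat") matrix
y_hat = H @ y                                   # in-sample predictions

# LOO residuals without retraining: r_i = (y_i - y_hat_i) / (1 - H_ii).
loo_residuals = (y - y_hat) / (1.0 - jnp.diag(H))[:, None]
loo_error = jnp.mean(loo_residuals ** 2)
print(f"closed-form LOO error: {loo_error:.4f}")
```

In practice the same kernel_fn would be evaluated on real MNIST or CIFAR10 arrays rather than random data; the closed-form identity avoids refitting the model n times, which is what makes the LOO error a tractable proxy for test error in kernel regimes like the NTK.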