Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Modern Look at the Relationship between Sharpness and Generalization

Authors: Maksym Andriushchenko, Francesco Croce, Maximilian Müller, Matthias Hein, Nicolas Flammarion

ICML 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We comprehensively explore this question in a detailed study of various definitions of adaptive sharpness in settings ranging from training from scratch on ImageNet and CIFAR-10 to fine-tuning CLIP on ImageNet and BERT on MNLI.
Researcher Affiliation | Academia | 1EPFL 2Tübingen AI Center 3University of Tübingen. Correspondence to: Maksym Andriushchenko <EMAIL>.
Pseudocode | Yes | For convenience we restate the algorithm of Auto-PGD in Algorithm 1.
Open Source Code | Yes | Our code is available at https://github.com/tml-epfl/sharpness-vs-generalization.
Open Datasets | Yes | We comprehensively explore this question in a detailed study of various definitions of adaptive sharpness in settings ranging from training from scratch on ImageNet and CIFAR-10 to fine-tuning CLIP on ImageNet and BERT on MNLI.
Dataset Splits | No | No explicit statement provides specific dataset-split information (exact percentages, sample counts, or detailed splitting methodology) for the training, validation, and test sets. The paper assumes standard splits for public datasets or refers to models from other works.
Hardware Specification | No | No specific hardware details, such as GPU/CPU models, processor types, or memory amounts used for running the experiments, are provided in the paper.
Software Dependencies | No | The paper mentions using the 'vit-pytorch' library for the ViT architecture and Auto-PGD for sharpness evaluation, but does not provide specific version numbers for these or other software dependencies such as PyTorch, Python, or CUDA.
Experiment Setup | Yes | We train models for 200 epochs using SGD with momentum and linearly decreasing learning rates after a linear warm-up for the first 40% of iterations. We vary the learning rate, ρ ∈ {0, 0.05, 0.1} of SAM (Foret et al., 2021), mixup (α = 0.5) (Zhang et al., 2018), and standard augmentations combined with RandAugment (Cubuk et al., 2020).
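The quoted experiment setup describes a two-phase schedule: a linear warm-up over the first 40% of iterations, followed by a linear decay of the learning rate. A minimal sketch of such a schedule is shown below; the helper name `lr_at_step` and the framework-free formulation are illustrative assumptions, not the authors' actual training code:

```python
def lr_at_step(step, total_steps, base_lr, warmup_frac=0.4):
    """Learning rate at a given optimizer step: linear warm-up over the
    first `warmup_frac` of training, then linear decay toward zero."""
    warmup_steps = int(warmup_frac * total_steps)
    if step < warmup_steps:
        # Ramp linearly from ~0 up to base_lr over the warm-up phase.
        return base_lr * (step + 1) / warmup_steps
    # Decay linearly from base_lr down to 0 over the remaining steps.
    remaining = total_steps - warmup_steps
    return base_lr * max(0.0, (total_steps - step) / remaining)
```

In a framework such as PyTorch, the same shape is typically implemented by passing a function like this (rescaled to a multiplier) to a lambda-based scheduler attached to the SGD-with-momentum optimizer.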