Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Tradeoffs in Data Augmentation: An Empirical Study
Authors: Raphael Gontijo-Lopes, Sylvia Smullin, Ekin Dogus Cubuk, Ethan Dyer
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Inspired by these, we conduct an empirical study to quantify how data augmentation improves model generalization. We present an empirical study of 204 different augmentations on CIFAR-10 and 225 on ImageNet, varying both broad transform families and finer transform parameters. |
| Researcher Affiliation | Industry | Raphael Gontijo-Lopes (Google Brain); Sylvia J. Smullin (Blueshift, Alphabet); Ekin D. Cubuk (Google Brain); Ethan Dyer (Blueshift, Alphabet) |
| Pseudocode | No | The paper describes methods in prose and with mathematical definitions, but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states that it used code based on an existing open-source project ('Cifar10 models were trained using code based on AutoAugment code [2]... [2] available at github.com/tensorflow/models/tree/master/research/autoaugment'), but it does not explicitly state that the code for its own methodology (e.g., the implementation of the Affinity and Diversity metrics or the experimental scripts) is open-source or otherwise provided. |
| Open Datasets | Yes | We present an empirical study of 204 different augmentations on CIFAR-10 and 225 on ImageNet... |
| Dataset Splits | Yes | Validation set was the last 5000 samples of the shuffled CIFAR-10 training data. |
| Hardware Specification | No | The paper states 'ImageNet models were ResNet-50 trained using the Cloud TPU codebase' but does not specify the exact model or version of the Cloud TPU (e.g., TPU v2, v3) or other hardware specifications for the experiments. |
| Software Dependencies | Yes | Models were trained using Python 2.7 and TensorFlow 1.13. |
| Experiment Setup | Yes | Experiments on CIFAR-10 used the WRN-28-2 model (Zagoruyko & Komodakis, 2016), trained for 78k steps with cosine learning rate decay. ... Experiments on ImageNet used the ResNet-50 model (He et al., 2016), trained for 112.6k steps with a weight decay rate of 1e-4, and a learning rate of 0.2, which is decayed by 10 at epochs 30, 60, and 80. Batch size was set to be 1024. |
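
The Dataset Splits and Experiment Setup rows quote two settings that are easy to misread, so here is a minimal Python sketch of one plausible reading of each. It is not the authors' code. The function names, the seed, and the shuffling procedure are hypothetical, and "decayed by 10" is assumed to mean the learning rate is divided by 10 at each listed epoch. The 50,000-image size of the CIFAR-10 training set is standard.

```python
import random


def split_train_validation(num_examples=50000, num_validation=5000, seed=0):
    """Shuffle example indices and hold out the last `num_validation` of them.

    Mirrors "the last 5000 samples of the shuffled CIFAR-10 training data".
    The seed and the use of Python's `random` module are assumptions; the
    quoted text does not say how the shuffle was performed.
    """
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    return indices[:-num_validation], indices[-num_validation:]


def step_decay_lr(epoch, base_lr=0.2, boundaries=(30, 60, 80), factor=10.0):
    """Piecewise-constant learning rate for the quoted ImageNet setup.

    Assumes "decayed by 10" means dividing the rate by `factor` once for
    every boundary epoch that has been reached.
    """
    num_decays = sum(1 for boundary in boundaries if epoch >= boundary)
    return base_lr / (factor ** num_decays)


if __name__ == "__main__":
    train_idx, val_idx = split_train_validation()
    print(len(train_idx), len(val_idx))
    for epoch in (0, 30, 60, 80):
        print(epoch, step_decay_lr(epoch))
```

Under these assumptions the split leaves 45,000 training and 5,000 validation indices, and the learning rate starts at 0.2 and drops by a factor of 10 at each of the three boundaries.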