Tradeoffs in Data Augmentation: An Empirical Study

Authors: Raphael Gontijo-Lopes, Sylvia Smullin, Ekin Dogus Cubuk, Ethan Dyer

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Inspired by these, we conduct an empirical study to quantify how data augmentation improves model generalization. We present an empirical study of 204 different augmentations on CIFAR-10 and 225 on ImageNet, varying both broad transform families and finer transform parameters.
Researcher Affiliation | Industry | Raphael Gontijo-Lopes (Google Brain, iraphael@google.com); Sylvia J. Smullin (Blueshift, Alphabet); Ekin D. Cubuk (Google Brain, cubuk@google.com); Ethan Dyer (Blueshift, Alphabet, edyer@google.com)
Pseudocode | No | The paper describes methods in prose and with mathematical definitions, but does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper states it used code based on an existing open-source project ('Cifar10 models were trained using code based on AutoAugment code ... available at github.com/tensorflow/models/tree/master/research/autoaugment'), but it does not explicitly state that the specific code for their described methodology (e.g., the implementation of the Affinity and Diversity metrics or their experimental scripts) is open-source or provided.
Open Datasets | Yes | We present an empirical study of 204 different augmentations on CIFAR-10 and 225 on ImageNet...
Dataset Splits | Yes | Validation set was the last 5000 samples of the shuffled CIFAR-10 training data.
Hardware Specification | No | The paper states 'ImageNet models were ResNet-50 trained using the Cloud TPU codebase' but does not specify the exact model or version of the Cloud TPU (e.g., TPU v2, v3) or other hardware specifications for the experiments.
Software Dependencies | Yes | Models were trained using Python 2.7 and TensorFlow 1.13.
Experiment Setup | Yes | Experiments on CIFAR-10 used the WRN-28-2 model (Zagoruyko & Komodakis, 2016), trained for 78k steps with cosine learning rate decay. ... Experiments on ImageNet used the ResNet-50 model (He et al., 2016), trained for 112.6k steps with a weight decay rate of 1e-4, and a learning rate of 0.2, which is decayed by 10 at epochs 30, 60, and 80. Batch size was set to be 1024.
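
For reference, below is a minimal Python sketch of the two learning-rate schedules quoted in the Experiment Setup row. It is illustrative only, not the authors' released code; the CIFAR-10 base learning rate is not stated in the quote, so it is left as a required parameter, and the function names are hypothetical.

import math

def cifar10_cosine_lr(step, base_lr, total_steps=78000):
    # Cosine decay from base_lr toward 0 over the 78k CIFAR-10 training steps
    # quoted above; base_lr is not given in the quote, so it must be supplied.
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))

def imagenet_step_lr(epoch, base_lr=0.2):
    # ImageNet schedule per the quoted setup: start at 0.2 and divide by 10
    # at epochs 30, 60, and 80.
    drops = sum(epoch >= boundary for boundary in (30, 60, 80))
    return base_lr / (10 ** drops)

# Example values for the ImageNet schedule:
# imagenet_step_lr(10) -> 0.2, imagenet_step_lr(45) -> 0.02, imagenet_step_lr(85) -> 0.0002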