Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

RandAugment: Practical Automated Data Augmentation with a Reduced Search Space

Authors: Ekin Dogus Cubuk, Barret Zoph, Jon Shlens, Quoc Le

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Despite the simplifications, our method achieves state-of-the-art performance on CIFAR-10, SVHN, and ImageNet. On EfficientNet-B7, we achieve 84.7% accuracy, a 1.0% increase over baseline augmentation and a 0.4% improvement over AutoAugment on the ImageNet dataset. On object detection, the same method used for classification leads to 1.0-1.3% improvement over the baseline augmentation method on COCO.
Researcher Affiliation	Industry	Ekin D. Cubuk, Barret Zoph, Jonathon Shlens, Quoc V. Le Google Research, Brain Team EMAIL
Pseudocode	Yes	Figure 3: Python code for RandAugment based on NumPy.
Open Source Code	Yes	Code is available online.
Open Datasets	Yes	CIFAR-10 [14], SVHN [24], and ImageNet [4] classification tasks.
Dataset Splits	Yes	N and M were selected based on the validation performance on 5K held out examples from the training set for 1 and 5 settings for N and M, respectively.
Hardware Specification	No	The paper mentions 'GPU hours' for AutoAugment's search, implying the use of GPUs, but does not provide specific models or other hardware details for their own experiments.
Software Dependencies	No	The paper mentions 'NumPy' and implies 'TensorFlow' through GitHub links, but no specific version numbers for any software dependencies are provided.
Experiment Setup	Yes	The Wide-ResNet models are all trained with K=14 data augmentations over a range of distortion magnitudes M parameterized on a uniform linear scale ranging from [0, 30]. Models are trained for 200 epochs on 45K training set examples. N and M were selected based on the validation performance on 5K held out examples from the training set for 1 and 5 settings for N and M, respectively.
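
The setup quoted in the table (N operations drawn from K=14 augmentations, one shared magnitude M on a [0, 30] scale, and a 45K/5K train/held-out split) can be sketched as follows. This is a minimal illustration, not the authors' Figure 3 code: the operation names, the function names `randaugment_policy` and `split_train_heldout`, and the use of Python's standard `random` module in place of NumPy are all assumptions made for this example.

```python
import random

# Assumed operation names: the table only states that K=14 augmentations are used.
TRANSFORMS = [
    "Identity", "AutoContrast", "Equalize", "Rotate", "Solarize",
    "Color", "Posterize", "Contrast", "Brightness", "Sharpness",
    "ShearX", "ShearY", "TranslateX", "TranslateY",
]
MAX_MAGNITUDE = 30  # upper end of the linear magnitude scale quoted in the table


def randaugment_policy(n, m, rng=None):
    """Return n (operation, magnitude) pairs, sampled uniformly with replacement."""
    if not 0 <= m <= MAX_MAGNITUDE:
        raise ValueError(f"magnitude must lie in [0, {MAX_MAGNITUDE}], got {m}")
    rng = rng or random.Random()
    return [(op, m) for op in rng.choices(TRANSFORMS, k=n)]


def split_train_heldout(num_examples=50_000, num_heldout=5_000, rng=None):
    """Shuffle example indices and hold out num_heldout of them for validation."""
    rng = rng or random.Random()
    indices = list(range(num_examples))
    rng.shuffle(indices)
    return indices[num_heldout:], indices[:num_heldout]


if __name__ == "__main__":
    # Example: N=2 operations at magnitude M=9 (values chosen only for illustration).
    print(randaugment_policy(2, 9, random.Random(0)))
    train_idx, heldout_idx = split_train_heldout(rng=random.Random(0))
    print(len(train_idx), len(heldout_idx))
```

Every sampled operation shares the same magnitude, so the search space reduces to the two integers N and M, which is what makes selecting them on the 5K held-out examples practical.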