AdaAug: Learning Class- and Instance-adaptive Data Augmentation Policies

Authors: Tsz-Him Cheung, Dit-Yan Yeung

ICLR 2022

Reproducibility checklist (variable: result, followed by the LLM response):
Research Type: Experimental. "Our experiments show that the adaptive augmentation policies learned by our method transfer well to unseen datasets such as the Oxford Flowers, Oxford-IIIT Pets, FGVC Aircraft, and Stanford Cars datasets when compared with other AutoDA baselines. In addition, our method also achieves state-of-the-art performance on the CIFAR-10, CIFAR-100, and SVHN datasets."
Researcher Affiliation: Academia. "Tsz-Him Cheung & Dit-Yan Yeung, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, {thcheungae, dyyeung}@cse.ust.hk"
Pseudocode: Yes. "Algorithm 1 Search algorithm"
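Algorithm 1 alternates between training the task model on policy-augmented data and updating the policy parameters γ on validation data. Below is a heavily simplified, self-contained sketch of that loop, assuming PyTorch ≥ 2.0. The three-op set, the differentiable mixture surrogate, the toy data, and the one-step-unrolled (DARTS-style) validation gradient are illustrative assumptions, not the authors' exact estimator or operation set; see the official repository for the real implementation.

```python
# Simplified sketch of the alternating search loop (Algorithm 1).
# Everything below is an illustrative stand-in, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch >= 2.0

n_ops = 3
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in classifier
# h: per-instance op weights + magnitudes; raw pixels stand in for the
# frozen feature extractor f(x) used in the paper.
policy = nn.Linear(3 * 32 * 32, 2 * n_ops)
opt_w = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
opt_g = torch.optim.Adam(policy.parameters(), lr=1e-3)  # gamma optimizer

def augment(x, params):
    """Differentiable surrogate: a softmax-weighted mixture of simple ops."""
    p = params[:, :n_ops].softmax(dim=1)   # per-instance op weights
    m = params[:, n_ops:].sigmoid()        # per-instance magnitudes
    views = torch.stack(
        [x,                                               # identity
         torch.flip(x, dims=[-1]),                        # horizontal flip
         (x + 0.5 * m[:, 2:3, None, None]).clamp(0, 1)],  # brightness shift
        dim=1)
    return (p[:, :, None, None, None] * views).sum(dim=1)

def toy_batches(n):  # random stand-in data so the sketch runs end to end
    for _ in range(n):
        yield torch.rand(128, 3, 32, 32), torch.randint(0, 10, (128,))

for step, ((x, y), (xv, yv)) in enumerate(zip(toy_batches(50), toy_batches(50))):
    if step % 10 == 0:  # outer step: update gamma via validation loss
        loss = F.cross_entropy(model(augment(x, policy(x.flatten(1)))), y)
        params = dict(model.named_parameters())
        grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
        virt = {k: w - 0.1 * g for (k, w), g in zip(params.items(), grads)}
        val_loss = F.cross_entropy(functional_call(model, virt, (xv,)), yv)
        opt_g.zero_grad(); val_loss.backward(); opt_g.step()
    # inner step: train the task model on policy-augmented data
    x_aug = augment(x, policy(x.flatten(1))).detach()
    loss = F.cross_entropy(model(x_aug), y)
    opt_w.zero_grad(); loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), 5.0)
    opt_w.step()
```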
Open Source Code: Yes. "Code is available at https://github.com/jamestszhim/adaptive_augment"
Open Datasets: Yes. "We search for the optimal augmentation policy on the CIFAR-100 dataset and use the learned policy to train with four fine-grained classification datasets: Oxford 102 Flowers (Nilsback & Zisserman, 2008), Oxford-IIIT Pets (Parkhi et al., 2012), FGVC Aircraft (Maji et al., 2013), and Stanford Cars (Krause et al., 2013). We compare AdaAug-direct with state-of-the-art AutoDA methods using the same evaluation datasets: CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and SVHN (Netzer et al., 2011)."
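For reference, all of the datasets named above ship with torchvision (0.13 or later for Flowers102, FGVCAircraft, and StanfordCars). A minimal loading sketch follows; the split names follow torchvision's API, not necessarily the paper's evaluation protocol.

```python
# Minimal loading sketch; torchvision >= 0.13 assumed.
from torchvision import datasets, transforms

tf, root = transforms.ToTensor(), "./data"
flowers  = datasets.Flowers102(root, split="train", transform=tf, download=True)
pets     = datasets.OxfordIIITPet(root, split="trainval", transform=tf, download=True)
aircraft = datasets.FGVCAircraft(root, split="trainval", transform=tf, download=True)
# The original Stanford Cars download URL is offline; place the files under
# `root` manually before constructing the dataset.
cars     = datasets.StanfordCars(root, split="train", transform=tf)
cifar100 = datasets.CIFAR100(root, train=True, transform=tf, download=True)
svhn     = datasets.SVHN(root, split="train", transform=tf, download=True)
```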
Dataset Splits: Yes. "We follow the setup adopted by AutoAugment (Cubuk et al., 2019) to use 4,000 training images for CIFAR-10 and CIFAR-100, and 1,000 training images for SVHN. The remaining images are used as the validation set."
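A minimal sketch of that reduced-data split for CIFAR-10; the random seed and the use of torchvision are my assumptions, not the paper's exact split.

```python
# Reduced-data split: 4,000 training images, the rest held out for validation.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

full = datasets.CIFAR10("./data", train=True, download=True,
                        transform=transforms.ToTensor())
perm = torch.randperm(len(full), generator=torch.Generator().manual_seed(0))  # assumed seed
train_set = Subset(full, perm[:4000].tolist())   # 4,000 training images
val_set   = Subset(full, perm[4000:].tolist())   # remaining 46,000 for validation
```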
Hardware Specification: Yes. "AdaAug takes only 3.3 GPU hours on an old GeForce GTX 1080 GPU card (see Appendix A.4)."
Software Dependencies: No. The paper mentions using the Adam optimizer but does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used.
Experiment Setup: Yes. "We implement h as a linear layer and update the policy parameter γ after every 10 training steps using the Adam optimizer with a learning rate of 0.001 and a batch size of 128. We use cosine learning rate decay with one annealing cycle (Loshchilov & Hutter, 2017), an initial learning rate of 0.1, weight decay of 1e-4, and a gradient clipping parameter of 5."
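Putting those numbers into code, a minimal configuration sketch assuming PyTorch; the stand-in models, the momentum value, the annealing horizon (T_max), and norm-based clipping are my assumptions, not part of the quoted setup. Batches of 128 examples would feed train_step.

```python
# Hyperparameters from the quoted setup; everything marked "assumed" is mine.
import torch
import torch.nn as nn

task_model = nn.Linear(3 * 32 * 32, 10)   # stand-in task network
h = nn.Linear(512, 30)                    # policy head h (a linear layer)

# gamma (policy parameters): Adam, lr 0.001, stepped once every 10 train steps
opt_gamma = torch.optim.Adam(h.parameters(), lr=0.001)

# task network: initial lr 0.1, weight decay 1e-4 (momentum 0.9 assumed)
opt_w = torch.optim.SGD(task_model.parameters(), lr=0.1,
                        momentum=0.9, weight_decay=1e-4)
# cosine decay with a single annealing cycle (200-epoch horizon assumed)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt_w, T_max=200)

def train_step(loss):
    """One task-model update with the reported clipping (norm clipping assumed)."""
    opt_w.zero_grad()
    loss.backward()
    nn.utils.clip_grad_norm_(task_model.parameters(), 5.0)  # clip parameter 5
    opt_w.step()
```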