Fast AutoAugment

Authors: Sungbin Lim, Ildoo Kim, Taesup Kim, Chiheon Kim, Sungwoong Kim

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that the proposed method can search augmentation policies significantly faster than AutoAugment (see Table 1), while retaining comparable performances to AutoAugment on diverse image datasets and networks, especially in two use cases: (a) direct augmentation search on the dataset of interest, (b) transferring learned augmentation policies to new datasets.
Researcher Affiliation | Collaboration | Sungbin Lim (UNIST, sungbin@unist.ac.kr); Ildoo Kim (Kakao Brain, ildoo.kim@kakaobrain.com); Taesup Kim (MILA, Université de Montréal, Canada, taesup.kim@umontreal.ca); Chiheon Kim (Kakao Brain, chiheon.kim@kakaobrain.com); Sungwoong Kim (Kakao Brain, swkim@kakaobrain.com)
Pseudocode | Yes | Algorithm 1: Fast AutoAugment (a toy structural sketch of this search loop appears after the table)
Open Source Code | Yes | Our code is open to the public via the official GitHub of Kakao Brain: https://github.com/kakaobrain/fast-autoaugment
Open Datasets | Yes | CIFAR-10, CIFAR-100 [20], and ImageNet [6] datasets. We conducted an experiment with the SVHN dataset [25].
Dataset Splits | Yes | For any given pair of D_train and D_valid, our goal is to improve the generalization ability by searching the augmentation policies that match the density of D_train with the density of augmented D_valid. Let us split D_train = D_M ∪ D_A into D_M and D_A, which are used for learning the model parameter θ and exploring the augmentation policy T, respectively. Validation set: Top-1 / Top-5 error rate (%) on ImageNet.
Hardware Specification | Yes | We estimate computation cost with an NVIDIA Tesla V100, while AutoAugment measured computation cost on a Tesla P100. The search takes less than 5 hours on CIFAR-10/100 with Wide-ResNet-28-10 and a single V100 GPU.
Software Dependencies | No | We utilize Ray [24] to implement Fast AutoAugment, which enables us to train models and search policies in a distributed manner. We use the HyperOpt library from Ray with B search trials and at most 20 concurrent evaluations. We split training sets while preserving the percentage of samples for each class (stratified shuffling) using the StratifiedShuffleSplit method in sklearn [27] (a minimal split sketch appears after the table). No specific version numbers are provided.
Experiment Setup | Yes | We utilize 5-fold stratified shuffling (K = 5), a search width of 2 (T = 2), a search depth of 200 (B = 200), and 10 selected policies (N = 10) for policy evaluation. ResNet-50 [12] was trained on each fold for 90 epochs during the policy search phase. Each sub-policy consists of two operations (N_τ = 2), and each policy consists of five sub-policies (N_T = 5). These hyperparameters are gathered into a configuration sketch after the table.
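
The D_M / D_A split described in the Dataset Splits and Software Dependencies rows can be illustrated with a minimal sketch. It assumes scikit-learn's StratifiedShuffleSplit (the class named in the report); the toy labels, the 80/20 ratio, and the random seed are illustrative assumptions rather than values taken from the paper.

```python
# Minimal sketch: stratified shuffling of D_train into D_M (model training)
# and D_A (policy exploration). The 80/20 ratio and the seed are assumptions.
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

labels = np.repeat(np.arange(10), 100)   # toy labels: 10 classes, 1,000 samples
features = np.zeros((len(labels), 1))    # placeholder features; only the length matters here

splitter = StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=0)  # K = 5 folds
for fold, (model_idx, aug_idx) in enumerate(splitter.split(features, labels)):
    # model_idx -> D_M (train the parameter theta), aug_idx -> D_A (explore policies)
    print(f"fold {fold}: |D_M| = {len(model_idx)}, |D_A| = {len(aug_idx)}")
```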
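
The hyperparameters listed in the Experiment Setup row can be collected into a small configuration object for reference. This is only a convenience sketch: the dataclass and its field names are assumptions, and only the values (K, T, B, N, N_τ, N_T, the ResNet-50 proxy model, and the 90 search epochs) come from the report.

```python
# Search hyperparameters from the Experiment Setup row, gathered in one place.
# Field names are illustrative; only the values are taken from the report.
from dataclasses import dataclass

@dataclass(frozen=True)
class FastAAConfig:
    num_folds: int = 5               # K: stratified shuffling folds
    search_width: int = 2            # T: exploration rounds per fold
    search_depth: int = 200          # B: candidate policies evaluated per round
    num_selected: int = 10           # N: top policies kept per round
    ops_per_subpolicy: int = 2       # N_tau: operations per sub-policy
    subpolicies_per_policy: int = 5  # N_T: sub-policies per policy
    proxy_model: str = "ResNet-50"   # model trained on each fold during search
    search_epochs: int = 90          # training epochs for the proxy model

print(FastAAConfig())
```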
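
For the Pseudocode row, the overall control flow of the search (Algorithm 1 in the paper) can be sketched in a toy, self-contained form. This is not the authors' implementation: random sampling stands in for the HyperOpt/Ray Bayesian optimization, a logistic-regression classifier stands in for the CNN, additive Gaussian noise stands in for the image operations, and B is lowered to keep the run fast. Only the K-fold / T-round / B-candidate / top-N structure follows the paper.

```python
# Toy structural sketch of the Fast AutoAugment search loop (Algorithm 1).
# Random search replaces Bayesian optimization and noise injection replaces the
# image augmentations; only the loop structure mirrors the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedShuffleSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                       # toy "dataset"
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

K, T, B, N = 5, 2, 50, 10                             # folds, rounds, candidates, kept policies
                                                      # (B lowered from 200 for this toy run)

def sample_policy():
    # A policy = 5 sub-policies, each with 2 (probability, magnitude) "operations".
    return [[(rng.uniform(0, 1), rng.uniform(0, 0.5)) for _ in range(2)] for _ in range(5)]

def apply_policy(policy, data):
    out = data.copy()
    sub = policy[rng.integers(len(policy))]           # one sub-policy per evaluation
    for prob, mag in sub:
        if rng.uniform() < prob:
            out = out + rng.normal(scale=mag, size=out.shape)
    return out

selected = []
splitter = StratifiedShuffleSplit(n_splits=K, test_size=0.2, random_state=0)
for model_idx, aug_idx in splitter.split(X, y):       # D_M / D_A per fold
    model = LogisticRegression(max_iter=1000).fit(X[model_idx], y[model_idx])  # train theta on D_M
    for _ in range(T):                                # T exploration rounds
        candidates = [sample_policy() for _ in range(B)]
        scores = [model.score(apply_policy(p, X[aug_idx]), y[aug_idx]) for p in candidates]
        top = np.argsort(scores)[::-1][:N]            # policies whose augmented D_A fits theta best
        selected += [candidates[i] for i in top]

print(f"collected {len(selected)} policies")          # K * T * N = 100
```

In the paper the selected policies from all folds are merged and used to augment the full training set when the final model is retrained; that retraining step is omitted from this sketch.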