Fast AutoAugment
Authors: Sungbin Lim, Ildoo Kim, Taesup Kim, Chiheon Kim, Sungwoong Kim
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed method can search augmentation policies significantly faster than AutoAugment (see Table 1), while retaining comparable performances to AutoAugment on diverse image datasets and networks, especially in two use cases: (a) direct augmentation search on the dataset of interest, (b) transferring learned augmentation policies to new datasets. |
| Researcher Affiliation | Collaboration | Sungbin Lim (UNIST, sungbin@unist.ac.kr); Ildoo Kim (Kakao Brain, ildoo.kim@kakaobrain.com); Taesup Kim (MILA, Université de Montréal, taesup.kim@umontreal.ca); Chiheon Kim (Kakao Brain, chiheon.kim@kakaobrain.com); Sungwoong Kim (Kakao Brain, swkim@kakaobrain.com) |
| Pseudocode | Yes | Algorithm 1: Fast AutoAugment |
| Open Source Code | Yes | Our code is open to the public by the official GitHub of Kakao Brain: https://github.com/kakaobrain/fast-autoaugment |
| Open Datasets | Yes | CIFAR-10, CIFAR-100 [20], and ImageNet [6] datasets. We conducted an experiment with the SVHN dataset [25]. |
| Dataset Splits | Yes | For any given pair of Dtrain and Dvalid, our goal is to improve the generalization ability by searching the augmentation policies that match the density of Dtrain with the density of augmented Dvalid. Let us split Dtrain = DM ∪ DA into DM and DA, which are used for learning the model parameter θ and exploring the augmentation policy T, respectively. Validation set Top-1 / Top-5 error rate (%) on ImageNet. (A split sketch follows this table.) |
| Hardware Specification | Yes | We estimate computation cost with an NVIDIA Tesla V100, while AutoAugment measured computation cost on a Tesla P100. Less than 5 hours on CIFAR-10/100 with WResNet28x10 and a single V100 GPU. |
| Software Dependencies | No | We utilize Ray [24] to implement Fast AutoAugment, which enables us to train models and search policies in a distributed manner. We use the HyperOpt library from Ray with B search numbers and 20 maximum concurrent evaluations. We split training sets while preserving the percentage of samples for each class (stratified shuffling) using the StratifiedShuffleSplit method in sklearn [27]. No specific version numbers are provided. |
| Experiment Setup | Yes | We utilize 5-fold stratified shuffling (K = 5), a search width of 2 (T = 2), a search depth of 200 (B = 200), and 10 selected policies (N = 10) for policy evaluation. ResNet-50 [12] models on each fold were trained for 90 epochs during the policy search phase. Each sub-policy consists of two operations (Nτ = 2); each policy consists of five sub-policies (NT = 5). (The policy-structure and search sketches below illustrate these settings.) |
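The Dataset Splits and Software Dependencies rows describe a K-fold stratified shuffling of Dtrain into DM (for model training) and DA (for policy exploration). Below is a minimal sketch of that split using sklearn's `StratifiedShuffleSplit`, the method named in the quote; the toy labels and the `test_size` proportion are assumptions for illustration, not values reported in the paper.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

K = 5  # number of folds, matching K = 5 in the Experiment Setup row

# Toy stand-in labels; real code would load CIFAR-10/100, SVHN, or ImageNet.
X = np.zeros((1000, 1))
y = np.random.randint(0, 10, size=1000)

# test_size (the fraction held out as D_A) is an assumed value.
sss = StratifiedShuffleSplit(n_splits=K, test_size=0.2, random_state=0)

# Each fold yields index arrays for D_M (model training) and D_A (policy
# search), with per-class proportions preserved by the stratification.
folds = list(sss.split(X, y))
```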
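The Experiment Setup row fixes the policy shape: each sub-policy is Nτ = 2 operations and each policy is NT = 5 sub-policies, where every operation carries an application probability and a magnitude. The sketch below shows that structure and how a policy is applied at training time; the operation names, their numeric values, and the `apply_op` dispatcher are hypothetical placeholders, not taken from the paper.

```python
import random

def apply_op(image, name, magnitude):
    # Hypothetical dispatcher: a real implementation would map `name` to a
    # PIL/torchvision transform scaled by `magnitude`.
    return image

# One policy: 5 sub-policies of 2 (operation, probability, magnitude) ops each.
# All values below are illustrative only.
policy = [
    [("Rotate", 0.7, 0.3), ("ShearX", 0.4, 0.5)],
    [("Color", 0.5, 0.8), ("Invert", 0.2, 0.0)],
    [("TranslateY", 0.6, 0.4), ("AutoContrast", 0.9, 0.1)],
    [("Solarize", 0.3, 0.6), ("Equalize", 0.8, 0.2)],
    [("Posterize", 0.5, 0.5), ("Brightness", 0.6, 0.7)],
]

def apply_policy(image, policy):
    """Sample one sub-policy, then apply each of its ops with probability p."""
    sub_policy = random.choice(policy)
    for name, prob, magnitude in sub_policy:
        if random.random() < prob:
            image = apply_op(image, name, magnitude)
    return image
```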
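The Pseudocode and Software Dependencies rows indicate that Algorithm 1 searches policies with HyperOpt over B evaluations and keeps the N best. Below is a minimal single-fold sketch of that Bayesian-optimization loop, assuming a hypothetical `evaluate_error` objective (the validation error of a model pretrained on DM, measured on DA augmented by the candidate policy) and a reduced operation list; the actual runs distribute this over Ray with 20 concurrent evaluations.

```python
import random
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe

B = 200  # search depth: number of Bayesian-optimization evaluations
N = 10   # number of selected policies kept per fold
OPS = ["ShearX", "Rotate", "Color", "Invert", "Solarize"]  # illustrative subset

def evaluate_error(candidate):
    # Hypothetical stand-in: the real objective evaluates a model pretrained
    # on D_M against D_A augmented by `candidate`. Lower is better.
    return random.random()

# One candidate policy: 5 sub-policies, each with 2 (op, prob, magnitude) ops.
space = [
    [
        (
            hp.choice(f"op_{s}_{o}", OPS),
            hp.uniform(f"prob_{s}_{o}", 0.0, 1.0),
            hp.uniform(f"mag_{s}_{o}", 0.0, 1.0),
        )
        for o in range(2)
    ]
    for s in range(5)
]

def objective(candidate):
    # Returning the candidate alongside the loss lets us recover it later.
    return {"loss": evaluate_error(candidate), "status": STATUS_OK,
            "policy": candidate}

trials = Trials()
fmin(objective, space, algo=tpe.suggest, max_evals=B, trials=trials)

# Keep the N lowest-error candidates, as in the policy selection step.
selected = [r["policy"]
            for r in sorted(trials.results, key=lambda r: r["loss"])[:N]]
```

In the full algorithm this loop runs once per fold (K = 5), and the selected policies from all folds are merged into the final augmentation set used for training.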