Adversarial AutoAugment
Authors: Xinyu Zhang, Qiang Wang, Jian Zhang, Zhao Zhong
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show experimental results of our approach on CIFAR-10/CIFAR-100, ImageNet, and demonstrate significant performance improvements over state-of-the-art. On CIFAR-10, we achieve a top-1 test error of 1.36%, which is the currently best performing single model. On ImageNet, we achieve a leading performance of top-1 accuracy 79.40% on ResNet-50 and 80.00% on ResNet-50-D without extra data. |
| Researcher Affiliation | Industry | Xinyu Zhang (Huawei, zhangxinyu10@huawei.com), Qiang Wang (Huawei, wangqiang168@huawei.com), Jian Zhang (Huawei, zhangjian157@huawei.com), Zhao Zhong (Huawei, zorro.zhongzhao@huawei.com) |
| Pseudocode | Yes | Algorithm 1: Joint Training of Target Network and Augmentation Policy Network (a minimal sketch of this loop appears after the table) |
| Open Source Code | No | The paper does not include any explicit statements about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | The CIFAR-10 dataset (Krizhevsky & Hinton, 2009) has 60000 images in total. The training and test sets have 50000 and 10000 images, respectively. Each image, of size 32×32, belongs to one of 10 classes. |
| Dataset Splits | Yes | As a great challenge in image recognition, the ImageNet dataset (Deng et al., 2009) has about 1.2 million training images and 50000 validation images with 1000 classes. |
| Hardware Specification | Yes | In Table 5, we take the training of ResNet-50 on ImageNet as an example to compare the computing cost and time overhead of our method and AutoAugment. From the table, we can find that our method has 12× less computing cost and 11× shorter time overhead than AutoAugment. The computing cost and time overhead are estimated on 64 NVIDIA Tesla V100s. |
| Software Dependencies | No | The RNN controller is implemented as a one-layer LSTM (Hochreiter & Schmidhuber, 1997). We use the Adam optimizer (Kingma & Ba, 2015) with an initial learning rate of 0.00035 to train the controller. While specific components are mentioned, no version numbers for programming languages, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch) are provided. |
| Experiment Setup | Yes | The RNN controller is implemented as a one-layer LSTM (Hochreiter & Schmidhuber, 1997). We set the hidden size to 100 and the embedding size to 32. We use the Adam optimizer (Kingma & Ba, 2015) with an initial learning rate of 0.00035 to train the controller. To avoid unexpected rapid convergence, an entropy penalty with a weight of 0.00001 is applied. All the reported results are the mean of five runs with different initializations. (A sketch of this controller configuration also appears after the table.) |
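
The Pseudocode row refers to Algorithm 1, in which the target network and the augmentation policy network are trained jointly and adversarially: the target network minimizes its training loss over M augmented copies of each batch, while the policy network is rewarded for increasing that loss. The following is a minimal sketch of that min-max loop, not the authors' code; the toy model, the identity placeholder for the augmentation, the value M = 4, and all tensor shapes are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

M = 4                                             # policies per batch (assumed)
target = nn.Linear(32, 10)                        # stand-in target network
policy_logits = nn.Parameter(torch.zeros(M, 16))  # stand-in policy network
opt_target = torch.optim.SGD(target.parameters(), lr=0.1, momentum=0.9)
opt_policy = torch.optim.Adam([policy_logits], lr=0.00035)

def apply_policy(x, op):
    """Placeholder: the real method applies the sampled image operations."""
    return x

for step in range(3):                             # toy loop over a few batches
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

    # Policy network samples M augmentation policies (here: one op id each).
    dist = torch.distributions.Categorical(logits=policy_logits)
    ops = dist.sample()
    log_prob, entropy = dist.log_prob(ops), dist.entropy()

    # Target network step: minimize the mean loss over the M augmented batches.
    losses = torch.stack([
        F.cross_entropy(target(apply_policy(x, ops[m])), y) for m in range(M)
    ])
    opt_target.zero_grad()
    losses.mean().backward()
    opt_target.step()

    # Policy step: REINFORCE with the target's losses as rewards (adversarial),
    # plus the entropy penalty with weight 0.00001 quoted in the setup row.
    advantage = losses.detach() - losses.detach().mean()
    policy_loss = -(log_prob * advantage).mean() - 1e-5 * entropy.mean()
    opt_policy.zero_grad()
    policy_loss.backward()
    opt_policy.step()
```

The update schedule and reward normalization in the paper may differ from this per-batch variant; the sketch only illustrates the adversarial structure of the joint training.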
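
The Software Dependencies and Experiment Setup rows describe the controller itself: a one-layer LSTM with hidden size 100 and embedding size 32, trained with Adam at an initial learning rate of 0.00035. A minimal sketch of that configuration follows; the operation vocabulary size, the policy head, and the token layout are assumptions, since the quoted passages do not fix them.

```python
import torch
import torch.nn as nn

NUM_OPS = 16          # assumed size of the augmentation-operation vocabulary
SEQ_LEN = 10          # assumed number of tokens per sampled policy

class Controller(nn.Module):
    """One-layer LSTM controller with hidden size 100 and embedding size 32."""
    def __init__(self, num_ops=NUM_OPS, embed_size=32, hidden_size=100):
        super().__init__()
        self.embed = nn.Embedding(num_ops, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers=1,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, num_ops)   # logits over operations

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.head(h)                           # (batch, seq, num_ops)

controller = Controller()
optimizer = torch.optim.Adam(controller.parameters(), lr=0.00035)

# Example: compute operation logits for two partially decoded policies.
logits = controller(torch.randint(0, NUM_OPS, (2, SEQ_LEN)))
```

At sampling time the logits would be decoded autoregressively into a policy and fed into a REINFORCE update like the one sketched above, where the entropy penalty with weight 0.00001 is meant to avoid unexpected rapid convergence.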