Improving Auto-Augment via Augmentation-Wise Weight Sharing

Authors: Keyu Tian, Chen Lin, Ming Sun, Luping Zhou, Junjie Yan, Wanli Ouyang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Comprehensive analysis verifies the superiority of this approach in terms of effectiveness and efficiency. The augmentation policies found by our method achieve superior accuracies compared with existing auto-augmentation search methods. On CIFAR-10, we achieve a top-1 error rate of 1.24%, which is currently the best performing single model without extra training data. On ImageNet, we get a top-1 error rate of 20.36% for ResNet-50, which leads to 3.34% absolute error rate reduction over the baseline augmentation. Following the literature on automatic augmentation, we evaluate the performance of our proposed method on three classification datasets: CIFAR-10 [15], CIFAR-100 [15], and ImageNet [6]."
Researcher Affiliation | Collaboration | Keyu Tian (SenseTime Research, Beihang University) tiankeyu.00@gmail.com; Chen Lin (SenseTime Research) linchen@sensetime.com; Ming Sun (SenseTime Research) sunming1@sensetime.com; Luping Zhou (University of Sydney) luping.zhou@sydney.edu.au; Junjie Yan (SenseTime Research) yanjunjie@sensetime.com; Wanli Ouyang (University of Sydney) wanli.ouyang@sydney.edu.au
Pseudocode | Yes | The paper provides Algorithm 1, "AWS Auto-Aug Search" (a schematic sketch of such a search loop follows this table).
Open Source Code | No | The paper states: "The augmentation policies we found on both CIFAR and ImageNet benchmark will be released to the public as an off-the-shelf augmentation policy to push the boundary of the state-of-the-art performance." This refers to releasing the *policies found* by their method, not the *source code implementing the AWS Auto-Aug methodology* itself, and no direct link to code is provided.
Open Datasets | Yes | "Following the literature on automatic augmentation, we evaluate the performance of our proposed method on three classification datasets: CIFAR-10 [15], CIFAR-100 [15], and ImageNet [6]."
Dataset Splits | Yes | "We train ResNet-18 [12] on CIFAR-10 [15] for 300 epochs in total... The detailed description and splitting ways of these datasets are presented in the supplementary material." Algorithm 1 also relies on a held-out validation split: "Use ACC(ω_θ, D_val) to update θ."
Hardware Specification | Yes | "We estimate ours with Tesla V100."
Software Dependencies | No | The paper mentions the use of an "Adam optimizer" and "Proximal Policy Optimization [31]" but does not specify the software stack or version numbers (e.g., programming language, deep learning framework, or library versions).
Experiment Setup | Yes | "The numbers of epochs of each part are set to 200 and 10, respectively, leading to 210 total number of epochs in the search process. The Tmax is set to 500. To optimize the policy, we use the Adam optimizer with a learning rate of η_θ = 0.1, β1 = 0.5 and β2 = 0.999. The learning rate η_θ is set to 0.2. The numbers of epochs of the two training stages are set to 150 and 5, respectively." (The quotes combine settings from more than one experimental configuration in the paper; the Adam settings are echoed in the sketch after this table.)
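
The Pseudocode, Dataset Splits, and Experiment Setup rows together outline a two-stage procedure: shared network weights ω are trained under augmentation-wise weight sharing, and candidate augmentation policies θ are then scored by validation accuracy ("Use ACC(ω_θ, D_val) to update θ") with the policy parameters trained by Adam (and PPO in the paper). The sketch below is a minimal, hypothetical reconstruction from those quotes only; the toy categorical policy, the stubbed training and evaluation functions, NUM_OPS, and the REINFORCE-style surrogate standing in for PPO are all assumptions, not the authors' Algorithm 1 or released code.

```python
# Hypothetical sketch of an AWS-style augmentation-policy search loop,
# assembled from the quotes in the table above.  All names and stubs here
# (NUM_OPS, train_shared_weights, finetune_and_validate, the REINFORCE-style
# update used instead of the paper's PPO) are illustrative assumptions.
import torch

NUM_OPS = 16          # assumed size of the augmentation-operation vocabulary
SEARCH_ITERS = 500    # "The Tmax is set to 500."

# Toy policy theta: a categorical distribution over augmentation operations.
policy_logits = torch.zeros(NUM_OPS, requires_grad=True)

# Policy optimizer as quoted in the Experiment Setup row:
# Adam with learning rate eta_theta = 0.1, beta1 = 0.5, beta2 = 0.999.
policy_opt = torch.optim.Adam([policy_logits], lr=0.1, betas=(0.5, 0.999))


def train_shared_weights():
    """Stage 1 stub: train shared weights omega under augmentation-wise
    weight sharing (200 epochs in the quoted setup)."""
    return None  # placeholder for the shared weights omega


def finetune_and_validate(omega, op_index):
    """Stage 2 stub: briefly fine-tune omega under the sampled augmentation
    and return ACC(omega_theta, D_val); a random value stands in here."""
    return torch.rand(()).item()


omega = train_shared_weights()
for t in range(SEARCH_ITERS):
    dist = torch.distributions.Categorical(logits=policy_logits)
    op = dist.sample()                                # sample a candidate policy
    reward = finetune_and_validate(omega, op.item())  # "Use ACC(omega_theta, D_val) to update theta"
    loss = -reward * dist.log_prob(op)                # REINFORCE surrogate (the paper uses PPO)
    policy_opt.zero_grad()
    loss.backward()
    policy_opt.step()
```

Running the loop as written only exercises the control flow, since the stubs return random accuracies; the point is the structure the quotes imply: the expensive shared-weight training happens once, while each policy update needs only a short fine-tune and a validation pass.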