Amata: An Annealing Mechanism for Adversarial Training Acceleration

Authors: Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu (pp. 10691–10699)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Despite their empirical success in various domains, deep neural networks have been revealed to be vulnerable to maliciously perturbed input data that greatly degrade their performance; such perturbations are known as adversarial attacks. To counter adversarial attacks, adversarial training, formulated as a form of robust optimization, has been demonstrated to be effective. However, adversarial training incurs substantial computational overhead compared with standard training. To reduce this cost, we propose an annealing mechanism, Amata, which lowers the overhead associated with adversarial training. Amata is provably convergent, well motivated from the lens of optimal control theory, and can be combined with existing acceleration methods to further enhance performance. On standard datasets, Amata achieves similar or better robustness in roughly 1/3 to 1/2 the computational time of traditional methods.
Researcher Affiliation | Academia | Nanyang Ye,1 Qianxiao Li,2,5 Xiao-Yun Zhou,3 Zhanxing Zhu*4 — 1 Shanghai Jiao Tong University, Shanghai, China; 2 National University of Singapore, Singapore; 3 The Hamlyn Centre for Robotic Surgery, Imperial College, London, United Kingdom; 4 Peking University, Beijing, China; 5 Institute of High Performance Computing, A*STAR, Singapore
Pseudocode | Yes | Algorithm 1: An instantiation of Amata for PGD
Open Source Code | No | The paper links to external, third-party code for Bayesian optimization, LeNet examples, YOPO, and ATTA (e.g., 'https://github.com/hyperopt/hyperopt', 'https://github.com/pytorch/examples/blob/master/mnist/main.py', 'https://github.com/a1600012888/YOPO-You-Only-Propagate-Once/tree/82c5b902508224c642c8d0173e61435795c0ac42/experiments/MNIST/YOPO-5-10', 'https://github.com/hzzheng93/ATTA'). However, it does not state that the authors' own implementation of Amata is open-source, nor does it provide a link to their specific code for this work.
Open Datasets | Yes | It is demonstrated that on standard datasets, Amata can achieve similar or better robustness with around 1/3 to 1/2 the computational time compared with traditional methods. The method is shown to be effective on benchmarks including MNIST (in the appendix), CIFAR10, Caltech256, and the large-scale ImageNet dataset.
Dataset Splits | No | The paper mentions using standard datasets such as MNIST, CIFAR10, Caltech256, and ImageNet, and refers to 'CIFAR10 validation'. However, it does not explicitly state the training/validation/test split percentages, sample counts, or the partitioning methodology (e.g., k-fold cross-validation, random seed) used in the experiments.
Hardware Specification | Yes | In our experiment, PyTorch 1.0.0 and a single GTX 1080 Ti GPU were used for the MNIST, CIFAR10, and Caltech256 experiments, while PyTorch 1.3.0 and four V100 GPUs were used for the ImageNet experiment.
Software Dependencies | Yes | In our experiment, PyTorch 1.0.0 and a single GTX 1080 Ti GPU were used for the MNIST, CIFAR10, and Caltech256 experiments, while PyTorch 1.3.0 and four V100 GPUs were used for the ImageNet experiment.
Experiment Setup | Yes | Input: T: training epochs; Kmin/Kmax: the minimal/maximal number of adversarial perturbation steps; θ: parameters of the neural network to be adversarially trained; B: mini-batch; α: step size for adversarial training; η: learning rate of the neural network parameters; τ: constant; ϵ: maximum perturbation. We use Amata with the setting Kmin = 2 and Kmax = 10. We use Amata with the setting Kmin = 2 and Kmax = 5. Amata setting 1: Kmin = 5, Kmax = 40; Amata setting 2: Kmin = 10, Kmax = 40.
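The Kmin/Kmax inputs above suggest that Amata anneals the number of PGD perturbation steps upward over training, spending fewer inner-loop iterations in early epochs. The following is a minimal, hypothetical sketch of such a schedule, assuming a simple linear interpolation from Kmin to Kmax over T epochs; the paper's Algorithm 1 may use a different annealing rule, and the function name here is illustrative only.

```python
# Hypothetical linear annealing of PGD step counts, illustrating the
# Kmin/Kmax inputs listed in the experiment setup. Not the paper's
# exact schedule.

def annealed_pgd_steps(epoch: int, total_epochs: int,
                       k_min: int, k_max: int) -> int:
    """Return the number of PGD steps to run at a given epoch,
    interpolated linearly from k_min (first epoch) to k_max (last)."""
    frac = epoch / max(total_epochs - 1, 1)  # 0.0 -> 1.0 over training
    return round(k_min + (k_max - k_min) * frac)

# Example: 100 epochs with the reported CIFAR10 setting Kmin=2, Kmax=10.
schedule = [annealed_pgd_steps(e, 100, 2, 10) for e in range(100)]
```

Early epochs would then use cheap, coarse adversarial examples (2 steps) while later epochs use the full-strength attack (10 steps), which is consistent with the reported 1/3 to 1/2 reduction in total computation.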