Universal Adversarial Training

Authors: Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein (pp. 5636-5643)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks. We propose a simple optimization-based universal attack that reduces the top-1 accuracy of various network architectures on ImageNet to less than 20%, while learning the universal perturbation 13× faster than the standard method. To defend against these perturbations, we propose universal adversarial training, which models the problem of robust classifier generation as a two-player min-max game, and produces robust models with only 2× the cost of natural training. We also propose a simultaneous stochastic gradient method that is almost free of extra computation, which allows us to do universal adversarial training on ImageNet.
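The min-max game described in the abstract alternates (or, in the simultaneous variant, combines) a descent step on the model weights with an ascent step on a single shared perturbation. The following is an illustrative numpy toy sketch of that simultaneous update on a hypothetical linear hinge-loss classifier, not the paper's implementation; all variable names and hyperparameters here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class problem with a linear model; labels in {-1, +1}.
X = rng.normal(size=(256, 10))
y = rng.integers(0, 2, size=256) * 2 - 1
w = np.zeros(10)            # model weights (minimized)
delta = np.zeros(10)        # universal perturbation (maximized)
eps, lr_w, lr_d = 0.5, 0.1, 0.2

for step in range(200):
    idx = rng.choice(len(X), size=32, replace=False)
    Xb, yb = X[idx] + delta, y[idx]       # every sample gets the SAME delta
    margin = yb * (Xb @ w)
    mask = margin < 1                     # hinge-loss active set
    if mask.any():
        grad_w = -(yb[mask][:, None] * Xb[mask]).mean(axis=0)
        grad_d = -yb[mask].mean() * w     # gradient of the loss wrt delta
    else:
        grad_w = np.zeros_like(w)
        grad_d = np.zeros_like(delta)
    # Simultaneous step: descend on w, ascend on delta with a sign step,
    # then project delta back onto the l-infinity eps-ball.
    w -= lr_w * grad_w
    delta = np.clip(delta + lr_d * np.sign(grad_d), -eps, eps)
```

The key design point the paper emphasizes is that the perturbation update reuses the gradient pass already needed for the weight update, which is why the method is "almost free of extra computation".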
Researcher Affiliation | Academia | Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein, Department of Computer Science, University of Maryland, College Park, Maryland 20742, {ashafahi, najibi, xuzh, john, lsd, tomg}@cs.umd.edu
Pseudocode | Yes | Algorithm 1: Iterative solver for universal perturbations (Moosavi-Dezfooli et al. 2017b); Algorithm 2: Stochastic gradient for universal perturbation; Algorithm 3: Alternating stochastic gradient method for adversarial training against universal perturbation; Algorithm 4: Simultaneous stochastic gradient method for adversarial training against universal perturbation
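Of the four algorithms listed, Algorithm 2 (the attack alone, against a frozen model) is the simplest to illustrate: iterate over mini-batches, take the gradient of the loss with respect to a single shared perturbation, ascend with a sign step, and project onto the eps-ball. The sketch below is a hypothetical numpy toy against a frozen logistic regressor, not the paper's code; the model, data, and hyperparameters are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def loss_grad_wrt_delta(w, Xb, yb, delta):
    """Gradient of the mean logistic loss wrt the shared perturbation delta."""
    z = (Xb + delta) @ w                   # perturbed logits
    p = 1.0 / (1.0 + np.exp(-z))           # sigmoid probabilities
    return ((p - yb)[:, None] * w).mean(axis=0)

w = rng.normal(size=10)                    # frozen classifier weights
X = rng.normal(size=(512, 10))
y = (X @ w > 0).astype(float)              # labels the frozen model gets right
delta, eps, lr = np.zeros(10), 0.3, 0.05

for step in range(100):
    idx = rng.choice(len(X), size=64, replace=False)
    g = loss_grad_wrt_delta(w, X[idx], y[idx], delta)
    # Ascent on the loss, then projection onto the l-infinity eps-ball.
    delta = np.clip(delta + lr * np.sign(g), -eps, eps)

acc_clean = (((X @ w) > 0).astype(float) == y).mean()
acc_adv = ((((X + delta) @ w) > 0).astype(float) == y).mean()
```

A universal perturbation shifts every input by the same delta, which is what distinguishes Algorithm 2 from standard per-example attacks such as PGD; the paper's version additionally clips the per-example loss to stop already-fooled examples from dominating the update.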
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We test this method by attacking a naturally trained WRN 32-10 architecture on the CIFAR-10 dataset" and "apply algorithm 2 to various popular architectures designed for classification on the ImageNet dataset (Russakovsky et al. 2015)".
Dataset Splits | Yes | "We use 5000 training samples from CIFAR-10 for constructing the universal adversarial perturbation for naturally trained WRN model from (Madry et al. 2018)", "The accuracy reported is the classification accuracy on the entire validation set of ImageNet after adding the universal perturbation", and "The accuracy of the universal perturbations on the validation examples are summarized in table 3".
Hardware Specification | No | The paper only generally mentions 'accelerates computation on a GPU' without providing specific details like GPU models (e.g., NVIDIA A100, RTX 2080 Ti), CPU models, or memory specifications.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup | Yes | "In our CIFAR experiments, we use ϵ = 8, batch size of 128, and we train for 80,000 steps. For the optimizer, we use Momentum SGD with an initial learning rate of 0.1 which drops to 0.01 at iteration 40,000 and drops further down to 0.001 at iteration 60,000." and "For ImageNet, we again use fairly standard training parameters (90 epochs, batch size 256)."
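The quoted CIFAR schedule is a standard piecewise-constant decay, which can be expressed as a small helper. This is a minimal sketch of the reported values only; the function name and its use inside a training loop are assumptions.

```python
def cifar_lr(step: int) -> float:
    """Piecewise-constant schedule reported for the CIFAR experiments:
    lr 0.1 for steps [0, 40k), 0.01 for [40k, 60k), 0.001 for [60k, 80k)."""
    if step < 40_000:
        return 0.1
    if step < 60_000:
        return 0.01
    return 0.001
```

In an 80,000-step run this yields two 10× drops, halfway and at three-quarters of training, a conventional recipe for momentum SGD on CIFAR-scale models.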