Universal Adversarial Training
Authors: Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein
AAAI 2020, pp. 5636-5643
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the efficient generation of universal adversarial perturbations, and also efficient methods for hardening networks to these attacks. We propose a simple optimization-based universal attack that reduces the top-1 accuracy of various network architectures on ImageNet to less than 20%, while learning the universal perturbation 13× faster than the standard method. To defend against these perturbations, we propose universal adversarial training, which models the problem of robust classifier generation as a two-player min-max game, and produces robust models with only 2× the cost of natural training. We also propose a simultaneous stochastic gradient method that is almost free of extra computation, which allows us to do universal adversarial training on ImageNet. (A hedged sketch of this min-max training loop appears after the table.) |
| Researcher Affiliation | Academia | Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein, Department of Computer Science, University of Maryland, College Park, Maryland 20742, {ashafahi, najibi, xuzh, john, lsd, tomg}@cs.umd.edu |
| Pseudocode | Yes | Algorithm 1 Iterative solver for universal perturbations (Moosavi-Dezfooli et al. 2017b), Algorithm 2 Stochastic gradient for universal perturbation, Algorithm 3 Alternating stochastic gradient method for adversarial training against universal perturbation, Algorithm 4 Simultaneous stochastic gradient method for adversarial training against universal perturbation (hedged sketches of Algorithms 2 and 4 follow the table) |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We test this method by attacking a naturally trained WRN 32-10 architecture on the CIFAR-10 dataset, and apply Algorithm 2 to various popular architectures designed for classification on the ImageNet dataset (Russakovsky et al. 2015). |
| Dataset Splits | Yes | We use 5000 training samples from CIFAR-10 for constructing the universal adversarial perturbation for the naturally trained WRN model from (Madry et al. 2018). The accuracy reported is the classification accuracy on the entire validation set of ImageNet after adding the universal perturbation. The accuracy of the universal perturbations on the validation examples is summarized in table 3. |
| Hardware Specification | No | The paper mentions only that the method 'accelerates computation on a GPU', without providing specific details such as GPU models (e.g., NVIDIA A100, RTX 2080 Ti), CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | In our CIFAR experiments, we use ϵ = 8, a batch size of 128, and we train for 80,000 steps. For the optimizer, we use Momentum SGD with an initial learning rate of 0.1, which drops to 0.01 at iteration 40,000 and drops further to 0.001 at iteration 60,000. For ImageNet, we again use fairly standard training parameters (90 epochs, batch size 256). (A sketch of this schedule follows the table.) |
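
The Pseudocode row references Algorithm 2, the paper's stochastic-gradient universal attack: a single perturbation shared across all images is updated by gradient ascent on the training loss, then projected onto an ℓ∞ ball of radius ϵ. The sketch below is a minimal reconstruction under that description; the function name, step size, input shape, and the choice of a sign-of-gradient ascent step are illustrative assumptions, not details taken from the paper.

```python
import torch

def universal_perturbation(model, loader, eps=8/255, step_size=1/255,
                           epochs=1, device="cpu"):
    """Minimal sketch of a stochastic-gradient universal attack (cf. Algorithm 2).

    One perturbation `delta` is shared across every image; each minibatch
    takes an ascent step on the loss w.r.t. delta, followed by projection
    onto the l-infinity ball of radius eps. All hyperparameters here are
    illustrative assumptions, not values from the paper.
    """
    model.eval()
    for p in model.parameters():          # freeze weights; only delta is optimized
        p.requires_grad_(False)
    delta = torch.zeros(1, 3, 32, 32, device=device, requires_grad=True)  # CIFAR-sized
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = loss_fn(model(torch.clamp(x + delta, 0.0, 1.0)), y)
            loss.backward()
            with torch.no_grad():
                delta += step_size * delta.grad.sign()   # ascend the loss
                delta.clamp_(-eps, eps)                  # project onto the l-inf ball
            delta.grad.zero_()
    return delta.detach()
```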
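Algorithms 3 and 4 defend against this attack by treating robust training as the two-player min-max game described in the Research Type row: the weights minimize the loss while the universal perturbation maximizes it. The sketch below follows the simultaneous variant (Algorithm 4), where one forward/backward pass supplies gradients for both players, which is why the paper describes it as almost free of extra computation; step sizes and names are again assumptions.

```python
import torch

def universal_adversarial_training(model, loader, optimizer, epochs=1,
                                   eps=8/255, delta_step=1/255, device="cpu"):
    """Minimal sketch of simultaneous stochastic gradient training (cf. Algorithm 4).

    Each minibatch runs a single forward/backward pass; the resulting gradients
    drive a descent step on the weights and an ascent step on the shared
    perturbation at the same time. Step sizes are illustrative assumptions.
    """
    loss_fn = torch.nn.CrossEntropyLoss()
    delta = torch.zeros(1, 3, 32, 32, device=device, requires_grad=True)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(torch.clamp(x + delta, 0.0, 1.0)), y)
            loss.backward()                              # gradients for both players
            optimizer.step()                             # minimize w.r.t. weights
            with torch.no_grad():
                delta += delta_step * delta.grad.sign()  # maximize w.r.t. delta
                delta.clamp_(-eps, eps)                  # stay in the l-inf ball
            delta.grad.zero_()
    return delta.detach()
```

The alternating variant (Algorithm 3) would instead recompute the loss between the two updates, roughly doubling the per-batch cost.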
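The CIFAR-10 hyperparameters quoted in the Experiment Setup row translate into a step-based Momentum SGD schedule. The snippet below reconstructs that schedule; the learning rates, milestones, batch size, and step count come from the quoted text, while the placeholder network, random stand-in data, and momentum coefficient 0.9 are assumptions (the paper names only "Momentum SGD").

```python
import torch
import torch.nn as nn

# Placeholder model; the paper trains a WRN 32-10, which is not reproduced here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Momentum SGD; the momentum coefficient 0.9 is an assumption.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# lr 0.1 -> 0.01 at step 40,000 -> 0.001 at step 60,000, per the paper.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[40_000, 60_000], gamma=0.1)

for step in range(80_000):                        # 80,000 training steps
    x = torch.rand(128, 3, 32, 32)                # batch size 128; random stand-in for CIFAR-10
    y = torch.randint(0, 10, (128,))
    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                              # schedule counts optimizer steps, not epochs
```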