On Adaptive Attacks to Adversarial Example Defenses
Authors: Florian Tramèr, Nicholas Carlini, Wieland Brendel, Aleksander Mądry
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that thirteen defenses recently published at ICLR, ICML and NeurIPS, and which illustrate a diverse set of defense strategies, can be circumvented despite attempting to perform evaluations using adaptive attacks. We find that we can circumvent all of them and substantially reduce the accuracy from what was originally claimed. |
| Researcher Affiliation | Collaboration | Florian Tramèr, Stanford University, tramer@cs.stanford.edu; Nicholas Carlini, Google, nicholas@carlini.com; Wieland Brendel, University of Tübingen, wieland.brendel@uni-tuebingen.de; Aleksander Mądry, MIT, madry@mit.edu |
| Pseudocode | No | The paper describes methods and steps in prose but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | To promote reproducibility and encourage others to perform independent re-evaluations of proposed defenses, we release code for all of our attacks at https://github.com/wielandbrendel/adaptive_attacks_paper. |
| Open Datasets | Yes | This reduces accuracy from 50.0% to 0.16% for an adversarially trained CIFAR-10 model with k-WTA activations. This attack reduces the defense's accuracy to 1% on CIFAR-10 (ϵ = 8/255). This attack reduces the defense's accuracy to < 1% at a 0% detection rate on both CIFAR-10 and ImageNet, for the same threat models as in the original paper. |
| Dataset Splits | No | The paper mentions using datasets like CIFAR-10 and ImageNet and discusses training and testing, but it does not explicitly detail the division of data into training, validation, and test sets. While these datasets often have standard splits, the paper does not specify the validation split used in its own experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies, programming languages, or libraries used in the experiments. |
| Experiment Setup | Yes | We estimate smoothed gradients via finite-differences from 20,000 small perturbations of the input, and run PGD for 100 steps. We found that multiplying the step size for PGD by three reduces accuracy from 48% to 26%. Increasing the number of steps to 250 further reduces accuracy to 10%. |
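
The "Experiment Setup" row describes a concrete attack recipe: estimate gradients of the smoothed model by finite differences over many small input perturbations, then run L∞ PGD with an enlarged step size. Below is a minimal sketch of that recipe, assuming a PyTorch image classifier with inputs in [0, 1]. The helper names (`estimate_smoothed_grad`, `pgd_attack`), the NES-style estimator, the noise scale `sigma`, the batching scheme, and the base step-size heuristic are all illustrative assumptions, not the authors' released implementation (see the repository linked above for that).

```python
import torch
import torch.nn.functional as F

def estimate_smoothed_grad(model, x, y, n_samples=20000, sigma=0.01, batch=200):
    """Finite-difference (NES-style) estimate of the gradient of the smoothed loss.

    x: input image of shape (1, C, H, W); y: its label, a long tensor of shape (1,).
    Draws n_samples small Gaussian perturbations and weights each noise direction
    by the loss it induces, approximating the gradient of E[loss(x + noise)].
    """
    grad = torch.zeros_like(x)
    for _ in range(n_samples // batch):
        noise = sigma * torch.randn((batch,) + x.shape[1:], device=x.device)
        logits = model(x + noise)  # x broadcasts across the noise batch
        losses = F.cross_entropy(logits, y.expand(batch), reduction="none")
        grad += (losses.view(batch, 1, 1, 1) * noise).sum(dim=0, keepdim=True)
    return grad / (n_samples * sigma ** 2)

def pgd_attack(model, x, y, eps=8/255, steps=100, step_mult=3.0):
    """L-inf PGD driven by the estimated gradients.

    step_mult=3.0 mirrors the reported "multiply the step size by three";
    the base step-size heuristic (2.5 * eps / steps) is an assumption.
    """
    alpha = step_mult * 2.5 * eps / steps
    x_adv = x.clone()
    with torch.no_grad():
        for _ in range(steps):
            g = estimate_smoothed_grad(model, x_adv, y)
            x_adv = x_adv + alpha * g.sign()
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)  # keep a valid image
    return x_adv

# Usage sketch: x is one CIFAR-10 image of shape (1, 3, 32, 32) in [0, 1],
# y its label of shape (1,); model is the defended classifier under attack.
# x_adv = pgd_attack(model, x, y)
```

Note that with 20,000 noise samples per gradient estimate and 100 PGD steps, this amounts to roughly two million forward passes per image, which is why the estimator processes the noise in batches.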