On Adaptive Attacks to Adversarial Example Defenses

Authors: Florian Tramèr, Nicholas Carlini, Wieland Brendel, Aleksander Mądry

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that thirteen defenses recently published at ICLR, ICML and NeurIPS, and which illustrate a diverse set of defense strategies, can be circumvented despite attempting to perform evaluations using adaptive attacks. We find that we can circumvent all of them and substantially reduce the accuracy from what was originally claimed.
Researcher Affiliation | Collaboration | Florian Tramèr, Stanford University, tramer@cs.stanford.edu; Nicholas Carlini, Google, nicholas@carlini.com; Wieland Brendel, University of Tübingen, wieland.brendel@uni-tuebingen.de; Aleksander Mądry, MIT, madry@mit.edu
Pseudocode | No | The paper describes methods and steps in prose but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | To promote reproducibility and encourage others to perform independent re-evaluations of proposed defenses, we release code for all of our attacks at https://github.com/wielandbrendel/adaptive_attacks_paper.
Open Datasets | Yes | This reduces accuracy from 50.0% to 0.16% for an adversarially trained CIFAR-10 model with k-WTA activations. This attack reduces the defense's accuracy to 1% on CIFAR-10 (ϵ = 8/255). This attack reduces the defense's accuracy to < 1% at a 0% detection rate on both CIFAR-10 and ImageNet, for the same threat models as in the original paper.
Dataset Splits | No | The paper mentions using datasets like CIFAR-10 and ImageNet and discusses training and testing, but it does not explicitly detail the division of data into training, validation, and test sets. While these datasets often have standard splits, the paper does not specify the validation split used in its own experiments.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper does not specify version numbers for any software dependencies, programming languages, or libraries used in the experiments.
Experiment Setup | Yes | We estimate smoothed gradients via finite-differences from 20,000 small perturbations of the input, and run PGD for 100 steps. We found that multiplying the step size for PGD by three reduces accuracy from 48% to 26%. Increasing the number of steps to 250 further reduces accuracy to 10%.
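
For illustration, the setup quoted above (finite-difference estimation of smoothed gradients from 20,000 small perturbations, PGD with an enlarged step size and more steps) can be sketched roughly as follows. This is a minimal sketch, not the authors' released attack code (see the repository linked under Open Source Code); the model interface, noise scale `sigma`, cross-entropy loss, batching, and the baseline step size of `eps / steps` are all assumptions made for the example.

```python
# Minimal sketch, assuming a PyTorch `model` that returns logits for inputs in [0, 1]
# and a single input `x` of shape (1, C, H, W) with label `y` of shape (1,).
# NOT the authors' released code; parameter choices are illustrative.
import torch
import torch.nn.functional as F


def estimate_smoothed_grad(model, x, y, n_samples=20_000, sigma=0.01, batch=500):
    """Finite-difference estimate of the gradient of the expected loss under small
    Gaussian input noise: grad ~ (1 / (n * sigma)) * sum_i loss(x + sigma * u_i) * u_i."""
    grad = torch.zeros_like(x)
    for _ in range(n_samples // batch):
        u = torch.randn(batch, *x.shape[1:], device=x.device)
        x_noisy = (x + sigma * u).clamp(0.0, 1.0)
        with torch.no_grad():
            losses = F.cross_entropy(model(x_noisy), y.expand(batch), reduction="none")
        grad += (losses.view(-1, 1, 1, 1) * u).sum(dim=0) / (n_samples * sigma)
    return grad


def pgd_attack(model, x, y, eps=8 / 255, steps=100, step_multiplier=3.0):
    """L-infinity PGD driven by the estimated smoothed gradient.
    `step_multiplier=3.0` mirrors 'multiplying the step size for PGD by three';
    the baseline step size of eps / steps is an assumption."""
    step_size = step_multiplier * eps / steps
    x_adv = x.clone()
    for _ in range(steps):
        g = estimate_smoothed_grad(model, x_adv, y)
        x_adv = x_adv + step_size * g.sign()                    # ascend estimated gradient
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)   # project onto eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                           # keep a valid image
    return x_adv
```

Under these assumptions, increasing `steps` from 100 to 250 corresponds to the stronger setting quoted above; the attacks actually evaluated in the paper are in the linked repository.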