Automated Discovery of Adaptive Attacks on Adversarial Defenses

Authors: Chengyuan Yao, Pavol Bielik, Petar Tsankov, Martin Vechev

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated our approach on 24 adversarial defenses and show that it outperforms AutoAttack (Croce & Hein, 2020b), the current state-of-the-art tool for reliable evaluation of adversarial defenses: our tool discovered significantly stronger attacks, producing 3.0%-50.8% additional adversarial examples for 10 models, while obtaining attacks of similar or slightly greater strength for the remaining models.
Researcher Affiliation | Collaboration | Chengyuan Yao (Department of Computer Science, ETH Zürich, Switzerland; chengyuan.yao@inf.ethz.ch), Pavol Bielik (LatticeFlow, Switzerland; pavol@latticeflow.ai), Petar Tsankov (LatticeFlow, Switzerland; petar@latticeflow.ai), Martin Vechev (Department of Computer Science, ETH Zürich, Switzerland; martin.vechev@inf.ethz.ch)
Pseudocode | Yes | Algorithm 1: a search algorithm that, given a model f with an unknown defense, discovers an adaptive attack with the best score from the attack search space A.
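To make the quoted description concrete, the sketch below shows the general shape of such a search loop. It is a minimal illustration, not the authors' implementation: the candidate-sampling and scoring interfaces (attack_space, success_rate) are hypothetical stand-ins for the paper's attack search space A and scoring function.

```python
# Hedged sketch of the Algorithm 1 search loop. All names are illustrative
# assumptions, not the authors' actual code.
import random
from typing import Callable, List, Sequence, Tuple

def success_rate(f: Callable, attack: Callable, data: Sequence[Tuple]) -> float:
    """Score an attack by the fraction of inputs it flips to a wrong label."""
    flipped = sum(1 for x, y in data if f(attack(x, y)) != y)
    return flipped / max(len(data), 1)

def search_adaptive_attack(f: Callable, attack_space: List[Callable],
                           data: Sequence[Tuple], k: int = 64):
    """Sample k candidate attacks from the search space and keep the best one."""
    best, best_score = None, -1.0
    for _ in range(k):
        candidate = random.choice(attack_space)  # draw a candidate from A
        score = success_rate(f, candidate, data)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```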
Open Source Code | Yes | Our tool A3 and scripts for reproducing the experiments are available online at: https://github.com/eth-sri/adaptive-auto-attack
Open Datasets | Yes | We use $D = \{(x_i, y_i)\}_{i=1}^{N}$ to denote a training dataset, where $x \in X$ is a natural input (e.g., an image) and $y$ is the corresponding label. CIFAR-10, $\ell_\infty$, $\epsilon = 4/255$
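For context, the quoted $\ell_\infty$ budget corresponds to the standard threat model for this setting; the formulation below is the textbook definition, not a passage from the paper:

```latex
% Standard l_infinity threat model implied by the quoted budget
% (standard definition, not copied from the paper): an adversarial
% example x' for input x with label y must fool the classifier f
% while staying inside an epsilon-ball around x.
\[
  \text{find } x' \quad \text{s.t.} \quad f(x') \neq y
  \quad \text{and} \quad \lVert x' - x \rVert_\infty \leq \epsilon = \tfrac{4}{255}.
\]
```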
Dataset Splits | No | The paper describes how its search algorithm uses subsets of the dataset for evaluation and optimization (e.g., 'initial dataset size n = 100', successive halving (SHA); see the sketch below), but it does not specify a formal train/validation/test split of the overall datasets (e.g., CIFAR-10) for model training or evaluation in a reproducible manner.
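The successive halving mentioned above is a standard budget-allocation scheme: all candidates are scored on a small subset, the weaker half is discarded, and the budget grows for the survivors. A minimal sketch, assuming a hypothetical score_fn(candidate, n) interface that evaluates a candidate attack on the first n samples:

```python
# Minimal successive-halving (SHA) sketch; `score_fn` is an assumed
# interface, not the authors' API.
def successive_halving(candidates, score_fn, n_init=100):
    n = n_init
    survivors = list(candidates)
    while len(survivors) > 1:
        # Score every surviving candidate on the current data budget.
        scored = sorted(survivors, key=lambda c: score_fn(c, n), reverse=True)
        survivors = scored[: max(1, len(scored) // 2)]  # keep the best half
        n *= 2  # double the evaluation budget for the next round
    return survivors[0]
```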
Hardware Specification | Yes | All of the experiments are performed using a single RTX 2080 Ti GPU.
Software Dependencies | Yes | The implementation of A3 is based on PyTorch (Paszke et al., 2019); the implementations of FGSM, PGD, NES, and DeepFool are based on Foolbox (Rauber et al., 2017) version 3.0.0; C&W is based on ART (Nicolae et al., 2018) version 1.3.0; and the attacks APGD, FAB, and SQR are from (Croce & Hein, 2020b).
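One plausible way to pin the quoted dependencies is a requirements file like the sketch below. The PyPI package names are the standard identifiers for these libraries, but the exact PyTorch version is not stated in the quote, so it is left unpinned; this is an assumption, not the repository's actual file.

```text
# requirements.txt sketch based on the quoted dependencies (assumed, not
# taken from the repository); PyTorch version unspecified in the paper.
torch
foolbox==3.0.0
adversarial-robustness-toolbox==1.3.0  # ART
```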
Experiment Setup | Yes | We instantiate Algorithm 1 by setting: the attack sequence length m = 3, the number of trials k = 64, the initial dataset size n = 100, and we use a time budget of 0.5 to 3 seconds per sample depending on the model size.
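Collected in one place, these hyper-parameters might be expressed as a simple config object. This is a hypothetical structure for readability; the field names are illustrative and do not reflect the repository's actual settings format.

```python
# Hypothetical config collecting the quoted Algorithm 1 hyper-parameters;
# field names are illustrative, not the repository's actual settings.
from dataclasses import dataclass

@dataclass
class SearchConfig:
    attack_sequence_length: int = 3      # m: attacks composed per candidate
    num_trials: int = 64                 # k: candidates sampled by the search
    initial_dataset_size: int = 100      # n: samples in the first SHA round
    time_budget_per_sample: tuple = (0.5, 3.0)  # seconds, by model size
```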