Black-box Adversarial Attacks with Limited Queries and Information

Authors: Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the methods proposed in Section 2 on their effectiveness in producing targeted adversarial examples in the three threat models we consider: query-limited, partial-information, and label-only. First, we present our evaluation methodology. Then, we present evaluation results for our three attacks. Finally, we demonstrate an attack against a commercial system: the Google Cloud Vision (GCV) classifier. Table 1 summarizes evaluation results for our attacks in the three different threat models we consider, and Figure 2 shows the distribution of the number of queries. Figure 3 shows a sample of the adversarial examples we produced.
Researcher Affiliation | Collaboration | 1 Massachusetts Institute of Technology, 2 Lab Six. Correspondence to: Lab Six <team@labsix.org>.
Pseudocode | Yes | Algorithm 1: NES Gradient Estimate; Algorithm 2: Partial Information Attack. (A sketch of the NES estimator appears after this table.)
Open Source Code | Yes | We have released full source code for the attacks we describe: https://github.com/labsix/limited-blackbox-attacks
Open Datasets | Yes | We evaluate the effectiveness of our attacks against an ImageNet classifier. We use a pre-trained Inception V3 network (Szegedy et al., 2015) that has 78% top-1 accuracy. (A model-loading sketch appears after this table.)
Dataset Splits | No | The paper mentions using the ImageNet test set but does not provide explicit training/validation splits or specific split percentages/counts needed to reproduce the data partitioning. It uses a 'pre-trained Inception V3 network' but does not detail the splits used for that training.
Hardware Specification | No | The paper mentions 'compute resources' in the acknowledgements but does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | No specific ancillary software details, such as library names with version numbers (e.g., Python, PyTorch, TensorFlow versions), are provided for replication.
Experiment Setup | Yes | Table 2 (Hyperparameters used for evaluation): σ for NES = 0.001; n, size of each NES population = 50; ϵ, ℓ∞ distance to the original image = 0.05; η, learning rate = 0.01. Partial-Information Attack: ϵ0, initial distance from source image = 0.5; δϵ, rate at which to decay ϵ = 0.001. Label-Only Attack: m, number of samples for proxy score = 50; µ, ℓ∞ radius of sampling ball = 0.001.
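
The Pseudocode row above cites Algorithm 1 (NES Gradient Estimate). Below is a minimal NumPy sketch of that estimator using antithetic Gaussian sampling; `prob_fn` is a hypothetical stand-in for a single black-box query returning P(y|x), and the default values follow Table 2 rather than the released code.

```python
import numpy as np

def nes_gradient_estimate(prob_fn, x, y, sigma=0.001, n_samples=50):
    """Estimate the gradient of P(y|x) with respect to the image x via NES.

    prob_fn(image, label) -> probability of `label` for `image`; this is a
    hypothetical query interface, not the interface of the released code.
    """
    grad = np.zeros_like(x, dtype=np.float64)
    for _ in range(n_samples // 2):
        u = np.random.randn(*x.shape)            # u ~ N(0, I)
        grad += prob_fn(x + sigma * u, y) * u    # forward sample
        grad -= prob_fn(x - sigma * u, y) * u    # antithetic sample
    return grad / (n_samples * sigma)
```

In the query-limited attack this estimate drives projected gradient descent: each step moves the image by the learning rate η along the sign of the estimated gradient and clips the result back into the ϵ ℓ∞ ball around the original image.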
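
The Open Datasets row notes that the target classifier is a pre-trained ImageNet Inception V3 network, but the paper does not specify how the weights are obtained. A minimal sketch, assuming the TensorFlow/Keras applications checkpoint is an acceptable substitute for the original one:

```python
import tensorflow as tf

# Assumption: the Keras ImageNet weights stand in for the paper's pre-trained
# Inception V3 checkpoint, which is not specified further.
model = tf.keras.applications.InceptionV3(weights="imagenet")
preprocess = tf.keras.applications.inception_v3.preprocess_input

def predict_probs(images):
    """Return class probabilities for a batch of 299x299 RGB images in [0, 255]."""
    return model.predict(preprocess(images.copy()))
```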
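
The Table 2 hyperparameters from the Experiment Setup row, gathered into one configuration for reproduction scripts; the dictionary layout and key names are illustrative assumptions, and only the values come from the paper.

```python
# Values from Table 2 of the paper; key names are illustrative only.
HYPERPARAMS = {
    "nes": {
        "sigma": 0.001,           # search variance for NES
        "population_size": 50,    # n, size of each NES population
        "epsilon": 0.05,          # l-infinity distance to the original image
        "learning_rate": 0.01,    # eta
    },
    "partial_information": {
        "epsilon_0": 0.5,         # initial distance from the source image
        "delta_epsilon": 0.001,   # rate at which epsilon is decayed
    },
    "label_only": {
        "num_samples": 50,        # m, number of samples for the proxy score
        "mu": 0.001,              # l-infinity radius of the sampling ball
    },
}
```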