Delving into Transferable Adversarial Examples and Black-box Attacks
Authors: Yanpei Liu, Xinyun Chen, Chang Liu, Dawn Song
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we are the first to conduct an extensive study of transferability over large models and a large-scale dataset, and we are also the first to study the transferability of targeted adversarial examples with their target labels. We study both non-targeted and targeted adversarial examples, and show that while transferable non-targeted adversarial examples are easy to find, targeted adversarial examples generated using existing approaches almost never transfer with their target labels. Therefore, we propose novel ensemble-based approaches to generating transferable adversarial examples (the underlying objective is sketched below the table). Using such approaches, we observe for the first time that a large proportion of targeted adversarial examples transfer with their target labels. We also present some geometric studies to help understand the transferable adversarial examples. Finally, we show that the adversarial examples generated using the ensemble-based approaches can successfully attack Clarifai.com, a black-box image classification system. |
| Researcher Affiliation | Academia | Yanpei Liu, Xinyun Chen (Shanghai Jiao Tong University); Chang Liu, Dawn Song (University of California, Berkeley) |
| Pseudocode | No | The paper describes methods using mathematical formulations and textual explanations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides links to pre-trained models (e.g., ResNet, GoogLeNet) and to a GitHub repository containing images and target labels used in their evaluation ('https://github.com/sunblaze-ucb/transferability-advdnn-pub'). However, it does not provide an explicit statement about releasing the source code for their own implemented adversarial generation methods or ensemble-based approaches, nor does it link to a repository containing that specific code. |
| Open Datasets | Yes | For the rest of the paper, we focus on examining the transferability among state-of-the-art models trained over ImageNet (Russakovsky et al. (2015)). |
| Dataset Splits | No | The paper mentions that the models examined are trained over ImageNet and refers to the ILSVRC 2012 validation set from which they selected 100 images for their test set. However, for their own experiments, they use pre-trained models and do not define explicit training or validation splits for their specific adversarial generation and testing process. They only define their test set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software such as the 'Adam Optimizer (Kingma & Ba (2014))' and 'Caffe' (implied by a link for GoogLeNet), but it does not specify version numbers for any software dependency, which would be needed for a fully reproducible setup. |
| Experiment Setup | Yes | We run Adam Optimizer for 100 iterations to generate the adversarial images. ... In particular, we set the learning rate to be 4. ... In particular, we set the learning rate of Adam to be 8 for each model. In each iteration, we compute the Adam update for each model, sum up the four updates, and add the aggregation onto the image. We run 100 iterations of updates, and we observe that the loss converges after 100 iterations. (A hedged re-implementation sketch of this update scheme follows the table.) |
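For context on the ensemble-based approach referenced in the Research Type row: the paper formulates the targeted ensemble attack as a single optimization over the weighted softmax outputs of the white-box models. The following is a reconstruction from the paper's description, not a verbatim quote; here $J_i(x)$ is the softmax output of the $i$-th model, $\alpha_i$ its ensemble weight, $1_{y^\star}$ the one-hot encoding of the target label, and $d(x, x^\star)$ a distortion penalty weighted by $\lambda$:

```latex
% Targeted ensemble objective (reconstructed from the paper's description)
\arg\min_{x} \; -\log\!\Bigl(\bigl(\textstyle\sum_{i} \alpha_i J_i(x)\bigr) \cdot 1_{y^\star}\Bigr) \;+\; \lambda \, d(x, x^\star)
```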
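The quoted experiment setup (per-model Adam updates summed across the four models and added onto the image, learning rate 8, 100 iterations) can be made concrete. Below is a minimal sketch assuming PyTorch models that map pixel tensors to logits; the function name `ensemble_adam_attack`, the [0, 255] pixel range, and the default Adam moment constants are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the summed per-model Adam-update scheme described in the
# paper's experiment setup; not the authors' implementation.
import torch
import torch.nn.functional as F

def ensemble_adam_attack(models, x_orig, target, lr=8.0, iters=100,
                         betas=(0.9, 0.999), eps=1e-8):
    """models: callables returning logits; target: LongTensor of shape (N,)."""
    x = x_orig.detach().clone()
    # One Adam state (m, v) per model, since each model keeps its own update.
    state = [{"m": torch.zeros_like(x), "v": torch.zeros_like(x)}
             for _ in models]
    for t in range(1, iters + 1):
        total_update = torch.zeros_like(x)
        for model, s in zip(models, state):
            x_var = x.detach().requires_grad_(True)
            # Targeted attack: minimize cross-entropy toward the target label.
            loss = F.cross_entropy(model(x_var), target)
            grad, = torch.autograd.grad(loss, x_var)
            s["m"] = betas[0] * s["m"] + (1 - betas[0]) * grad
            s["v"] = betas[1] * s["v"] + (1 - betas[1]) * grad ** 2
            m_hat = s["m"] / (1 - betas[0] ** t)
            v_hat = s["v"] / (1 - betas[1] ** t)
            total_update += -lr * m_hat / (v_hat.sqrt() + eps)
        # The sum of the per-model updates is added onto the image, as quoted.
        x = (x + total_update).clamp(0, 255)  # pixel range is an assumption
    return x.detach()
```

Usage would look like `adv = ensemble_adam_attack([m1, m2, m3, m4], x, torch.tensor([target_id]))`. The design point the quote highlights is that Adam's moment estimates are maintained per model rather than on the aggregated gradient, which is why the sketch keeps a separate state dict for each model.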