Towards Transferable Adversarial Attacks on Vision Transformers

Authors: Zhipeng Wei, Jingjing Chen, Micah Goldblum, Zuxuan Wu, Tom Goldstein, Yu-Gang Jiang

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the transferability of attacks on state-of-the-art ViTs, CNNs and robustly trained CNNs. The results of these experiments demonstrate that the proposed dual attack can greatly boost transferability between ViTs and from ViTs to CNNs. In addition, the proposed method can easily be combined with existing transfer methods to boost performance.
Researcher Affiliation Academia 1Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University 2Shanghai Collaborative Innovation Center on Intelligent Visual Computing 3Department of Computer Science, University of Maryland
Pseudocode Yes Algorithm 1: The dual attack on ViTs. Input: the loss function J of Equation 7, a white-box model f, a clean image x with its ground-truth class y. Parameter: the perturbation budget ε, iteration number I, used patch number T. Output: the adversarial example.
1: δ_0 ← 0
2: α ← ε / I
3: for i = 0 to I − 1 do
4:   x_s ← PatchOut(x_p, T)
5:   M ← Equation 6
6:   g ← PNA(∇_δ J with the L2 norm)
7:   δ_i ← clip_ε(δ_{i−1} + α · g)
8: end for
9: x_adv = x + δ_I
10: return x_adv
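The loop above can be sketched in plain numpy. This is a minimal illustration, not the authors' implementation: `grad_fn` is a hypothetical stand-in for the PNA gradient (the attention-gradient bypass inside the white-box ViT), pixel values are assumed in [0, 1] so the paper's ε = 16/255, and PatchOut is modeled as randomly keeping T of the N patches each iteration.

```python
import numpy as np

def patchout_mask(num_patches, T, patch, H, W, rng):
    """PatchOut sketch: randomly keep T of N patches; returns a pixel mask."""
    keep = rng.choice(num_patches, size=T, replace=False)
    per_side = W // patch
    mask = np.zeros((H, W), dtype=np.float32)
    for idx in keep:
        r, c = divmod(int(idx), per_side)
        mask[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = 1.0
    return mask

def dual_attack(x, grad_fn, eps=16 / 255, I=10, T=130, patch=16, seed=0):
    """Sketch of Algorithm 1: iterative attack with PatchOut masking and
    L2-normalized gradients. grad_fn(x_adv) is an assumed callable that
    returns the loss gradient w.r.t. the input."""
    rng = np.random.default_rng(seed)
    H, W = x.shape[:2]
    N = (H // patch) * (W // patch)
    alpha = eps / I                                # step size α = ε / I
    delta = np.zeros_like(x)
    for _ in range(I):
        mask = patchout_mask(N, T, patch, H, W, rng)
        g = grad_fn(x + delta) * mask[..., None]   # attack only sampled patches
        g = g / (np.linalg.norm(g) + 1e-12)        # L2 normalization of the gradient
        delta = np.clip(delta + alpha * g, -eps, eps)
    return x + delta
```

With a dummy all-ones gradient, the returned perturbation stays inside the ε-ball by construction of the final clip.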
Open Source Code Yes Code is available at https://github.com/zhipeng-wei/PNA-PatchOut.
Open Datasets Yes We further conduct extensive experiments on the ImageNet dataset (Russakovsky et al. 2015).
Dataset Splits Yes Following the setting from Dong et al. (2018, 2019); Xie et al. (2019); Lin et al. (2019), we randomly sample one image, which is correctly classified by all models, from each class of the ImageNet 2012 validation dataset (Russakovsky et al. 2015), to conduct our experiments.
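The sampling procedure described above can be sketched as follows. The helper names (`val_images_by_class`, `correctly_classified`) are hypothetical; the paper does not specify its tooling, only the rule: one randomly chosen image per class, restricted to images every evaluated model classifies correctly.

```python
import random

def sample_eval_set(val_images_by_class, correctly_classified, seed=0):
    """Pick one image per class that all models classify correctly.

    val_images_by_class: dict mapping class label -> list of image ids
    correctly_classified: predicate, True if every model gets the image right
    """
    rng = random.Random(seed)
    chosen = []
    for cls, images in sorted(val_images_by_class.items()):
        candidates = [im for im in images if correctly_classified(im)]
        if candidates:                      # skip classes with no valid image
            chosen.append(rng.choice(candidates))
    return chosen
```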
Hardware Specification No The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. It does not mention any specific hardware setup.
Software Dependencies No The paper mentions 'timm library' and implicitly PyTorch, but does not provide specific version numbers for these or any other software dependencies. The reference to 'Py Torch Image Models' does not specify a version used by the authors.
Experiment Setup Yes Following Dong et al. (2018, 2019); Xie et al. (2019), we set the norm constraint ϵ = 16 and J as the cross-entropy loss function. For the iterative attack, we set I = 10 and thus the step size α = 1.6. We resize all images to 224 × 224 to conduct experiments. For the inputs of ViTs, we set the patch size P = 16, thus the number of the patches is N = 196. We set our patch number T = 130, and we set the balancing parameter λ = 0.1 in PatchOut according to the experimental results.
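The reported hyperparameters are internally consistent, which a quick sanity check confirms (a sketch using only the values quoted above):

```python
# Step size: α = ε / I
eps, I = 16, 10
alpha = eps / I
assert alpha == 1.6            # matches the reported step size

# ViT patch count: N = (H / P) * (W / P)
H = W = 224
P = 16
N = (H // P) * (W // P)
assert N == 196                # matches the reported number of patches

# PatchOut attacks T = 130 of the N = 196 patches per iteration
T = 130
assert T < N
```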