Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?

Authors: Yonggan Fu, Shunyao Zhang, Shang Wu, Cheng Wan, Yingyan Lin

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we ask an intriguing question: Under what kinds of perturbations do ViTs become more vulnerable learners compared to CNNs? Driven by this question, we first conduct a comprehensive experiment regarding the robustness of both ViTs and CNNs under various existing adversarial attacks to understand the underlying reason favoring their robustness. Based on the drawn insights, we then propose a dedicated attack framework, dubbed Patch-Fool, that fools the self-attention mechanism by attacking its basic component (i.e., a single patch) with a series of attention-aware optimization techniques. Interestingly, our Patch-Fool framework shows for the first time that ViTs are not necessarily more robust than CNNs against adversarial perturbations. In particular, we find that ViTs are more vulnerable learners compared with CNNs against our Patch-Fool attack, which is consistent across extensive experiments, and the observations from Sparse/Mild Patch-Fool, two variants of Patch-Fool, indicate an intriguing insight that the perturbation density and strength on each patch seem to be the key factors that influence the robustness ranking between ViTs and CNNs. It can be expected that our Patch-Fool framework will shed light on both future architecture designs and training schemes for robustifying ViTs towards their real-world deployment. Our codes are available at https://github.com/RICE-EIC/Patch-Fool.
Researcher Affiliation | Academia | Yonggan Fu, Shunyao Zhang, Shang Wu, Cheng Wan & Yingyan Lin, Department of Electrical and Computer Engineering, Rice University {yf22, sz74, sw99, chwan, yingyan.lin}@rice.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our codes are available at https://github.com/RICE-EIC/Patch-Fool.
Open Datasets | Yes | Models and datasets. We mainly benchmark the robustness of the DeiT (Touvron et al., 2021) family with the ResNet (He et al., 2016) family, using their official pretrained models. Note that we adopt DeiT models without distillation for a fair comparison. We randomly select 2500 images from the validation set of ImageNet for evaluating robustness, following (Bhojanapalli et al., 2021).
Dataset Splits | Yes | We randomly select 2500 images from the validation set of ImageNet for evaluating robustness, following (Bhojanapalli et al., 2021). (A minimal sampling sketch appears after this table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or cloud computing instance types used for running experiments.
Software Dependencies | No | For the CW-L∞ and CW-L2 attacks, we adopt the implementation in AdverTorch (Ding et al., 2019) and the same settings as (Chen et al., 2021a; Rony et al., 2019); for AutoAttack, we adopt the official implementation and default settings in (Croce & Hein, 2020). The paper mentions AdverTorch and AutoAttack but does not provide specific version numbers for these or other software components. (See the baseline-attack sketch after this table.)
Experiment Setup | Yes | Patch-Fool settings. The weight coefficient α in Eq. 4 is set to 0.002. The step size η in Eq. 7 is initialized to 0.2 and decayed by 0.95 every 10 iterations, and the total number of iterations is 250. For evaluating Patch-Fool with different perturbation strengths, we allow Patch-Fool to attack up to four patches based on the attention-aware patch selection in Sec. 3.4, i.e., the patches with the top importance scores defined in Eq. 2 will be selected. (A sketch combining these settings appears below.)
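To make the Open Datasets / Dataset Splits protocol concrete, here is a minimal sketch of drawing 2,500 random ImageNet validation images. The dataset path, random seed, preprocessing transform, and batch size are assumptions, not values reported in the paper.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# Standard ImageNet eval preprocessing; the exact transform is an assumption,
# the paper uses the official DeiT / ResNet pretrained-model pipelines.
tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

val = datasets.ImageNet("/path/to/imagenet", split="val", transform=tf)

g = torch.Generator().manual_seed(0)             # seed is an assumption
idx = torch.randperm(len(val), generator=g)[:2500]
loader = DataLoader(Subset(val, idx.tolist()), batch_size=64, shuffle=False)
```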
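For the Software Dependencies row, the following is a hedged sketch of invoking the two baseline-attack libraries the paper names. The model choice, eps, and CW hyperparameters are illustrative placeholders; only calls from AdverTorch's and AutoAttack's public interfaces are used.

```python
import timm                                    # one convenient source of DeiT/ResNet weights
from advertorch.attacks import CarliniWagnerL2Attack
from autoattack import AutoAttack

model = timm.create_model("deit_small_patch16_224", pretrained=True).eval()
x, y = next(iter(loader))                      # `loader` from the sampling sketch above

# CW-L2 via AdverTorch; these hyperparameters are illustrative, the paper
# reuses the settings of Chen et al. (2021a) / Rony et al. (2019).
cw = CarliniWagnerL2Attack(model, num_classes=1000,
                           binary_search_steps=9, max_iterations=1000)
x_cw = cw.perturb(x, y)

# AutoAttack with its default "standard" suite (Croce & Hein, 2020);
# eps here is a placeholder, not necessarily the paper's value.
aa = AutoAttack(model, norm="Linf", eps=8 / 255, version="standard")
x_aa = aa.run_standard_evaluation(x, y, bs=64)
```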
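Finally, a sketch of how the Experiment Setup hyperparameters fit together in a Patch-Fool-style loop. The `get_attention` accessor, the attention layer index, and the signed-gradient ascent step (standing in for the paper's Eq. 7 update) are assumptions; the attention-aware loss term of Eq. 4 (weight α = 0.002) is noted in a comment but omitted for brevity.

```python
import torch
import torch.nn.functional as F

def patch_fool_sketch(model, x, y, patch_size=16, num_patches=4,
                      eta=0.2, decay=0.95, iters=250, layer=4):
    """Sketch of a Patch-Fool-style attack for a batch of one image.

    `model.get_attention(x)` is a placeholder: it is assumed to return a
    list of per-layer attention tensors of shape (B, heads, tokens, tokens).
    """
    B, C, H, W = x.shape
    n = H // patch_size                       # patches per side

    # Attention-aware patch selection (cf. Eq. 2): score each key patch by
    # the total attention it receives at one layer, summed over the batch,
    # heads, and query tokens, then take the top-k patches.
    with torch.no_grad():
        attn = model.get_attention(x)         # placeholder accessor
        scores = attn[layer].sum(dim=(0, 1, 2))[1:]   # drop the class token
        top = scores.topk(num_patches).indices

    # Pixel mask covering the selected patches.
    mask = torch.zeros_like(x)
    for p in top.tolist():
        r, c = divmod(p, n)
        mask[:, :, r*patch_size:(r+1)*patch_size,
                   c*patch_size:(c+1)*patch_size] = 1.0

    delta = torch.zeros_like(x, requires_grad=True)
    step = eta
    for it in range(iters):
        loss = F.cross_entropy(model(x + delta * mask), y)
        # The paper additionally maximizes an attention-aware loss weighted
        # by alpha = 0.002 (Eq. 4); omitted here to keep the sketch short.
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign() * mask  # stand-in for Eq. 7
            delta.grad.zero_()
        if (it + 1) % 10 == 0:
            step *= decay                     # eta decayed by 0.95 every 10 iters
    return (x + delta * mask).detach()
```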