Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly

Authors: Qizhang Li, Yiwen Guo, Wangmeng Zuo, Hao Chen

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Therefore, we establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods. In this paper, we evaluate and compare them comprehensively on 25 popular substitute/victim models on ImageNet. New insights about the effectiveness of these methods are gained and guidelines for future evaluations are provided.
Researcher Affiliation | Collaboration | 1Harbin Institute of Technology, 2Tencent Security Big Data Lab, 3Independent Researcher, 4UC Davis
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Code at: https://github.com/qizhangli/TA-Bench. [...] A Unified Codebase. We offer an open-source codebase for TA-Bench, featuring a well-organized code structure that can effectively accommodate a diverse range of transfer-based attacks, as well as various substitute/victim models. It provides a unified setting for evaluations, ensuring consistency and reproducibility in experimental results. The code is at https://github.com/qizhangli/TA-Bench.
Open Datasets | Yes | All evaluations on our benchmark are conducted on ImageNet [43]. [...] We randomly selected 5,000 benign examples that could be correctly classified by all the victim models, from the ImageNet validation set, to craft adversarial examples.
Dataset Splits | Yes | We randomly selected 5,000 benign examples that could be correctly classified by all the victim models, from the ImageNet validation set, to craft adversarial examples. [...] To ensure optimal performance across different substitute models, we employed a validation set consisting of 500 samples that were distinct from the test examples to tune hyper-parameters of compared methods.
Hardware Specification | Yes | All experiments are performed on an NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions using "timm [63] on GitHub" for models but does not provide specific version numbers for timm or any other software library or dependency.
Experiment Setup | Yes | The optimization process of each compared method runs 100 iterations with a step size of 1/255 and 1 for the ℓ∞ constraint and ℓ2 constraint, respectively. [...] We adopted the default hyper-parameters for all combined methods, and it is possible (yet computationally very intensive, since the number of combinations is huge) to carefully tune hyper-parameters to achieve even better combinations. [...] The detailed hyper-parameters are reported in Section F.
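The test-set construction quoted above (keep only examples that every victim model classifies correctly, then sample 5,000 of them) can be sketched as a small filtering routine. This is not TA-Bench's actual code (which lives in the linked repository); the function name and the precomputed-predictions input format are assumptions made for illustration.

```python
import numpy as np

def select_benign_examples(predictions, labels, n_select=5000, seed=0):
    """Pick examples that every victim model classifies correctly.

    predictions: (n_models, n_samples) array of predicted labels,
                 one row per victim model (hypothetical input format).
    labels:      (n_samples,) array of ground-truth labels.
    Returns indices of n_select examples, drawn at random without
    replacement from those that all models get right.
    """
    correct_for_all = np.all(predictions == labels[None, :], axis=0)
    candidates = np.flatnonzero(correct_for_all)
    rng = np.random.default_rng(seed)
    return rng.choice(candidates, size=n_select, replace=False)
```

The same routine, with a different `n_select` and a disjoint candidate pool, would also cover the 500-sample hyper-parameter validation set mentioned under Dataset Splits.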
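To make the reported setup concrete, here is a minimal sketch of the kind of iterative sign-gradient attack (I-FGSM-style) the ℓ∞ numbers describe: 100 iterations with a step size of 1/255. Only the ℓ∞ branch is shown; the perturbation budget `epsilon=8/255` and the `grad_fn` interface are assumptions not stated in this excerpt, and TA-Bench's 30+ implemented methods differ in their gradient computation.

```python
import numpy as np

def linf_iterative_attack(x, grad_fn, epsilon=8/255, step_size=1/255,
                          iterations=100):
    """Sign-gradient ascent under an l_inf constraint (I-FGSM-style sketch).

    grad_fn(x_adv) must return the loss gradient w.r.t. the input.
    epsilon=8/255 is an assumed budget, not taken from the paper.
    """
    x_adv = x.copy()
    for _ in range(iterations):
        g = grad_fn(x_adv)
        x_adv = x_adv + step_size * np.sign(g)            # ascend the loss
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)  # project onto l_inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                  # stay a valid image
    return x_adv
```

Against a real substitute model, `grad_fn` would backpropagate a classification loss through the network; transfer-based methods then evaluate the resulting `x_adv` on separate victim models.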