LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack
Authors: Hai Zhu, Qingyang Zhao, Weiwei Shang, Yuren Wu, Kai Liu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that LimeAttack achieves better attack performance than existing hard-label attacks under the same query budget. |
| Researcher Affiliation | Collaboration | Hai Zhu (University of Science and Technology of China; Ping An Technology), Qingyang Zhao (Xidian University), Weiwei Shang (University of Science and Technology of China), Yuren Wu (Ping An Technology), Kai Liu (Lazada) |
| Pseudocode | No | The paper describes the algorithm steps in paragraph form and includes a flowchart, but it does not feature a formally labeled or structured pseudocode block. |
| Open Source Code | Yes | Code is available at https://github.com/zhuhai-ustc/limeattack |
| Open Datasets | Yes | We adopt seven common datasets, including MR (Pang and Lee 2005), SST-2 (Socher et al. 2013), AG (Zhang, Zhao, and LeCun 2015), and Yahoo (Yoo et al. 2020) for text classification, and SNLI (Bowman et al. 2015) and MNLI (Williams, Nangia, and Bowman 2018) for textual entailment. (A dataset-loading sketch follows the table.) |
| Dataset Splits | No | The paper mentions using common datasets and sampling 1,000 texts to attack, but it does not provide percentages or counts for training, validation, or test splits, so the data partitioning cannot be fully reproduced. |
| Hardware Specification | No | The paper discusses the experimental setup and training procedures but does not specify any particular hardware components such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software like NLTK and Universal Sentence Encoder, but it does not provide specific version numbers for these or other software dependencies required to replicate the experimental environment. |
| Experiment Setup | Yes | We set the kernel width σ = 25, the number of neighborhood samples equal to the number of the benign sample's tokens, and the beam size b = 10. For a fair comparison, all baselines follow the same settings: synonyms are selected from the counter-fitted embedding space, the size of each candidate set is k = 50, and the same 1000 texts are sampled for the baselines to attack. The results are averaged over five runs with different seeds (1234, 2234, 3234, 4234, and 5234) to eliminate randomness. To improve the quality of adversarial examples, the attack succeeds only if the perturbation rate of the adversarial example is less than 10%. We set a tiny query budget of 100 for the hard-label attack, which corresponds to real-world settings. (A configuration sketch follows the table.) |
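
Since the Open Datasets row reports the corpora only by name, a minimal loading sketch can make the data setup concrete. The Hugging Face dataset identifiers below are assumptions, not taken from the paper or its repository; the authors may distribute their own preprocessed copies instead.

```python
# A minimal sketch of loading the six named datasets via the Hugging Face
# `datasets` library. The identifiers are assumptions; the paper does not
# state how the data were obtained.
from datasets import load_dataset

CLASSIFICATION = {
    "MR": "rotten_tomatoes",         # Pang and Lee 2005 movie reviews
    "SST-2": "sst2",
    "AG": "ag_news",
    "Yahoo": "yahoo_answers_topics",
}
ENTAILMENT = {
    "SNLI": "snli",
    "MNLI": "multi_nli",
}

if __name__ == "__main__":
    for name, hf_id in {**CLASSIFICATION, **ENTAILMENT}.items():
        # Small training slice as a smoke test; the paper samples 1000
        # texts per dataset for the actual attack.
        ds = load_dataset(hf_id, split="train[:5]")
        print(name, ds.column_names)
```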
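The quoted experiment setup is concrete enough to transcribe into a configuration object. The sketch below is a minimal illustration, not code from the authors' repository; the class and field names are hypothetical, and only the values come from the paper.

```python
# A hedged transcription of the reported hyperparameters. Names such as
# LimeAttackConfig and neighborhood_size are hypothetical; the values are
# the ones quoted in the Experiment Setup row above.
from dataclasses import dataclass, field
from typing import List


@dataclass
class LimeAttackConfig:
    kernel_width: float = 25.0            # LIME kernel width sigma
    beam_size: int = 10                   # beam size b for beam search
    candidate_set_size: int = 50          # k synonyms per word, counter-fitted space
    query_budget: int = 100               # hard-label query budget per example
    max_perturbation_rate: float = 0.10   # success requires < 10% perturbation
    num_attack_samples: int = 1000        # texts sampled per dataset to attack
    seeds: List[int] = field(
        default_factory=lambda: [1234, 2234, 3234, 4234, 5234]
    )


def neighborhood_size(benign_tokens: List[str]) -> int:
    # Per the quoted setup, the number of LIME neighborhood samples
    # equals the token count of the benign input.
    return len(benign_tokens)


if __name__ == "__main__":
    cfg = LimeAttackConfig()
    print(cfg)
    print(neighborhood_size("the movie was surprisingly good".split()))
```

Tying the neighborhood size to the input length, rather than fixing it globally, matches the paper's statement that the number of neighborhood samples equals the number of tokens in the benign sample.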