Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Revisiting Character-level Adversarial Attacks for Language Models
Authors: Elias Abad Rocamora, Yongtao Wu, Fanghui Liu, Grigorios Chrysos, Volkan Cevher
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Specifically, on BERT with SST-2, Charmer improves the ASR in 4.84% points and the USE similarity in 8% points with respect to the previous art. Our implementation is available in github.com/LIONS-EPFL/Charmer. and (Section 5, Experiments) Our experiments are conducted in the publicly available TextAttack models (Morris et al., 2020b) and open-source large language models including Llama 2-Chat 7B (Touvron et al., 2023) and Vicuna 7B (Chiang et al., 2023). |
| Researcher Affiliation | Academia | (1) LIONS, École Polytechnique Fédérale de Lausanne, Switzerland; (2) Department of Computer Science, University of Warwick, United Kingdom; (3) Department of Electrical and Computer Engineering, University of Wisconsin-Madison, USA. |
| Pseudocode | Yes | Algorithm 1 Heuristic for Top-n position selection. and Algorithm 2 Charmer Adversarial Attack |
| Open Source Code | Yes | Our implementation is available in github.com/LIONS-EPFL/Charmer. |
| Open Datasets | Yes | Our experiments are conducted in the publicly available TextAttack models (Morris et al., 2020b) and open-source large language models including Llama 2-Chat 7B (Touvron et al., 2023) and Vicuna 7B (Chiang et al., 2023). and All of our datasets are publicly available in https://huggingface.co/datasets. |
| Dataset Splits | Yes | If a test dataset is not available for a benchmark, we evaluate in the validation dataset, this is a standard practice (Morris et al., 2020b). |
| Hardware Specification | Yes | All of our experiments were conducted in a machine with a single NVIDIA A100 SXM4 GPU. |
| Software Dependencies | No | The paper mentions using TextAttack models and describes the experimental setup in general terms, but it does not specify versions for Python, PyTorch, or other relevant dependencies. |
| Experiment Setup | Yes | For Charmer we use n = 20 positions (see Algorithm 1) and k = 10 except for AG-news where we use k = 20 because of the much longer sentences present in the dataset. Charmer-Fast simply takes n = 1 to speed-up the attack. |
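
The Experiment Setup row can be captured as a small lookup, which may be handy when trying to rerun the attack with the reported values. The following is a minimal sketch, not code from the Charmer repository: the names `AttackSettings` and `charmer_settings` and the dataset string `"ag-news"` are hypothetical, and it assumes that Charmer-Fast keeps the same per-dataset `k` as Charmer, which the quoted text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackSettings:
    """Hypothetical container for the two hyperparameters quoted above."""
    n: int  # number of positions considered (see Algorithm 1 in the paper)
    k: int  # the paper's k; 10 by default, 20 for AG-news


def charmer_settings(dataset: str, fast: bool = False) -> AttackSettings:
    """Return the reported (n, k) for a dataset.

    Assumption: Charmer-Fast only changes n (to 1) and reuses the
    per-dataset k of the standard attack.
    """
    # Normalize common spellings such as "AG-news" or "ag_news".
    normalized = dataset.strip().lower().replace("_", "-")
    k = 20 if normalized == "ag-news" else 10
    n = 1 if fast else 20
    return AttackSettings(n=n, k=k)
```

For example, `charmer_settings("sst2")` yields `n = 20, k = 10`, while `charmer_settings("AG-news", fast=True)` yields `n = 1, k = 20` under the assumption stated above.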