Fool Your (Vision and) Language Model with Embarrassingly Simple Permutations
Authors: Yongshuo Zong, Tingyang Yu, Ruchika Chavhan, Bingchen Zhao, Timothy Hospedales
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Specifically, we show empirically that popular models are vulnerable to adversarial permutation in answer sets for multiple-choice prompting, which is surprising as models should ideally be as invariant to prompt permutation as humans are. |
| Researcher Affiliation | Academia | 1University of Edinburgh, 2EPFL. |
| Pseudocode | No | The paper provides a mathematical equation for the attack but does not include any pseudocode or algorithm blocks (a minimal sketch of the permutation attack is given after this table). |
| Open Source Code | Yes | Code is available at https://github.com/ys-zong/FoolyourVLLMs. |
| Open Datasets | Yes | All of the datasets we use are publicly available. Specifically, for LLMs, we utilize MMLU (Hendrycks et al., 2020), ARC challenge (ARC-c) (Clark et al., 2018), BoolQ (Clark et al., 2019), SocialiQA (Sap et al., 2019), and MedMCQA (Pal et al., 2022). For VLLMs, we use ScienceQA (Lu et al., 2022), A-OKVQA (Schwenk et al., 2022), MMBench (Liu et al., 2023c), and SEED-Bench (Li et al., 2023a). |
| Dataset Splits | Yes | Specifically, for LLMs, we utilize MMLU (Hendrycks et al., 2020), ARC challenge (ARC-c) (Clark et al., 2018), BoolQ (Clark et al., 2019), SocialiQA (Sap et al., 2019), and MedMCQA (Pal et al., 2022). For VLLMs, we use ScienceQA (Lu et al., 2022), A-OKVQA (Schwenk et al., 2022), MMBench (Liu et al., 2023c), and SEED-Bench (Li et al., 2023a). ... As many of the benchmarks do not provide a training set, we conduct two fine-tuning experiments using Llama2-7B on two datasets that do provide training sets: ARC-Challenge (Clark et al., 2018) and MedMCQA (Pal et al., 2022). |
| Hardware Specification | Yes | Experiments are conducted on A100-80GB GPUs. |
| Software Dependencies | No | The paper mentions accessing model weights from Hugging Face or official repositories but does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For both LLMs and VLLMs, we use greedy decoding to ensure reproducibility. ... We fine-tune with LoRA (Hu et al., 2022) for 1 epoch. |
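The Pseudocode row above notes that the paper formalizes the attack only as an equation. As a reading aid, here is a minimal Python sketch of an adversarial answer-set permutation attack of the kind the paper describes: enumerate orderings of the answer options and keep one that makes the model answer incorrectly. The `query_model` callable and the prompt template are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
# A minimal sketch (not the authors' code) of an adversarial answer-set permutation
# attack: try every ordering of the options and keep one that makes the model pick a
# wrong letter. `query_model` is a hypothetical stand-in for a call to the evaluated
# (V)LLM with greedy decoding that returns a single letter such as "A".
import string
from itertools import permutations
from typing import Callable, List, Optional, Tuple

def format_prompt(question: str, options: List[str]) -> str:
    """Render a lettered multiple-choice prompt (A., B., C., ...)."""
    letters = string.ascii_uppercase
    lines = [question] + [f"{letters[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)

def adversarial_permutation(
    question: str,
    options: List[str],
    correct_idx: int,
    query_model: Callable[[str], str],
) -> Optional[Tuple[List[str], str, str]]:
    """Return (permuted options, correct letter, model prediction) if some ordering fools the model."""
    letters = string.ascii_uppercase
    for perm in permutations(range(len(options))):
        permuted = [options[i] for i in perm]
        correct_letter = letters[perm.index(correct_idx)]
        prediction = query_model(format_prompt(question, permuted)).strip()
        if prediction != correct_letter:
            return permuted, correct_letter, prediction  # this ordering fools the model
    return None  # the model answers correctly under every ordering of this question
```

Running this search over a benchmark and counting the questions for which a fooling ordering exists gives the kind of attack success rate the paper reports; the sketch only illustrates the search over orderings.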
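The Experiment Setup row reports greedy decoding and one epoch of LoRA fine-tuning on Llama2-7B. Below is a minimal sketch, assuming the Hugging Face `transformers` and `peft` libraries, of what such a setup can look like; the LoRA rank, alpha, and target modules are illustrative assumptions rather than the paper's exact configuration.

```python
# A minimal sketch of greedy decoding (for reproducible evaluation) and LoRA adapter
# setup; hyperparameters are placeholders, not the paper's reported values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # Llama2-7B, as in the fine-tuning experiments
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Greedy decoding: disable sampling so repeated runs produce identical outputs.
inputs = tokenizer("Question: ...\nA. ...\nB. ...\nAnswer:", return_tensors="pt")
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Wrap the model with LoRA adapters; training for a single epoch would then proceed
# with a standard Trainer or custom loop on the ARC-Challenge / MedMCQA training sets.
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                         task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```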