Imperio: Language-Guided Backdoor Attacks for Arbitrary Model Control
Authors: Ka-Ho Chow, Wenqi Wei, Lei Yu
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments across three datasets, five attacks, and nine defenses confirm Imperio's effectiveness. |
| Researcher Affiliation | Academia | 1The University of Hong Kong 2Fordham University 3Rensselaer Polytechnic Institute |
| Pseudocode | No | The paper describes algorithms and methods in prose but does not contain any formally labeled pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | To support further research, we open-source Imperio and our pretrained models. Supplementary materials are available at https://khchow.com/Imperio. |
| Open Datasets | Yes | We conduct experiments on three datasets and various architectures for the victim classifier: a CNN model for Fashion MNIST (FMNIST), a Pre-activation ResNet18 model for CIFAR10, and a ResNet18 model for Tiny ImageNet (TImageNet). |
| Dataset Splits | No | The paper describes training and mentions batch size and epochs, but does not explicitly provide specific train/validation/test dataset splits with percentages, sample counts, or a detailed splitting methodology. |
| Hardware Specification | No | The paper discusses model architectures (e.g., CNN, ResNet18, Llama-2) and training configurations but does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions the use of 'Llama-2-13b-chat' as the LLM and 'SGD' as the optimizer, and discusses other LLMs like BERT, RoBERTa, and FLAN-T5. However, it does not provide specific version numbers for programming languages, libraries, or frameworks used (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | Hyperparameters. The training lasts for 100 epochs for FMNIST and 500 epochs for CIFAR10 and TImageNet. For all datasets, we use SGD as the optimizer, with 0.01 as the initial learning rate. The batch size is 512, where the fraction of poisoned samples is p = 0.10. Following [Doan et al., 2022], the maximum change to the clean image is ϵ = 0.05. |
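The quoted hyperparameters are concrete enough to sketch as a reproduction config. The following is a minimal illustration, assuming the reported values only; the variable and function names are hypothetical and not taken from the authors' released code.

```python
# Hyperparameters as quoted from the paper (Imperio, IJCAI 2024).
BATCH_SIZE = 512
POISON_FRACTION = 0.10   # p: fraction of poisoned samples per batch
EPSILON = 0.05           # maximum per-pixel change to a clean image
LEARNING_RATE = 0.01     # initial SGD learning rate
EPOCHS = {"FMNIST": 100, "CIFAR10": 500, "TImageNet": 500}

def poisoned_per_batch(batch_size=BATCH_SIZE, p=POISON_FRACTION):
    """Number of poisoned samples mixed into each training batch."""
    return int(batch_size * p)

def clamp_perturbation(delta, eps=EPSILON):
    """Clip a trigger perturbation to the reported L-infinity budget."""
    return [max(-eps, min(eps, d)) for d in delta]

print(poisoned_per_batch())                      # 51 poisoned samples per batch
print(clamp_perturbation([0.2, -0.1, 0.03]))     # [0.05, -0.05, 0.03]
```

This makes the poisoning budget explicit: at p = 0.10 with batch size 512, roughly 51 samples per batch carry the trigger, and each trigger perturbation stays within ±0.05 of the clean image.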