Imperio: Language-Guided Backdoor Attacks for Arbitrary Model Control

Authors: Ka-Ho Chow, Wenqi Wei, Lei Yu

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments across three datasets, five attacks, and nine defenses confirm Imperio's effectiveness."
Researcher Affiliation | Academia | The University of Hong Kong; Fordham University; Rensselaer Polytechnic Institute
Pseudocode | No | The paper describes algorithms and methods in prose but does not contain any formally labeled pseudocode blocks or algorithm figures.
Open Source Code | Yes | "To support further research, we open-source Imperio and our pretrained models. Supplementary materials are available at https://khchow.com/Imperio."
Open Datasets | Yes | "We conduct experiments on three datasets and various architectures for the victim classifier: a CNN model for Fashion-MNIST (FMNIST), a pre-activation ResNet18 model for CIFAR10, and a ResNet18 model for Tiny ImageNet (TImageNet)."
Dataset Splits | No | The paper describes training and mentions batch size and epochs, but does not explicitly provide train/validation/test splits with percentages, sample counts, or a splitting methodology.
Hardware Specification | No | The paper discusses model architectures (e.g., CNN, ResNet18, Llama-2) and training configurations but does not specify the hardware used, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions 'Llama-2-13b-chat' as the LLM and 'SGD' as the optimizer, and discusses other LLMs such as BERT, RoBERTa, and FLAN-T5, but it does not give version numbers for the programming languages, libraries, or frameworks used (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | "Hyperparameters. The training lasts for 100 epochs for FMNIST and 500 epochs for CIFAR10 and TImageNet. For all datasets, we use SGD as the optimizer, with 0.01 as the initial learning rate. The batch size is 512, where the fraction of poisoned samples is p = 0.10. Following [Doan et al., 2022], the maximum change to the clean image is ϵ = 0.05."
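The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. The structure and names below are illustrative (they are not taken from the authors' released code); only the values come from the paper.

```python
# Illustrative sketch of the reported training configuration for Imperio.
# Dictionary keys and the helper function are hypothetical; the values
# (epochs, SGD, lr 0.01, batch size 512, p = 0.10, eps = 0.05) are from the paper.

EPOCHS = {
    "FMNIST": 100,      # 100 epochs for Fashion-MNIST
    "CIFAR10": 500,     # 500 epochs for CIFAR10
    "TImageNet": 500,   # 500 epochs for Tiny ImageNet
}

COMMON = {
    "optimizer": "SGD",
    "initial_lr": 0.01,
    "batch_size": 512,
    "poison_fraction": 0.10,  # fraction p of poisoned samples per batch
    "epsilon": 0.05,          # maximum change to the clean image
}

def poisoned_per_batch(batch_size: int, p: float) -> int:
    """Number of poisoned samples expected in each training batch."""
    return int(batch_size * p)
```

With a batch size of 512 and p = 0.10, each batch would carry roughly 51 poisoned samples, which gives a concrete sense of the poisoning rate used in the experiments.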