VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models

Authors: Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct a comprehensive study on the generalizability of our attack framework. We use the caption as the trigger to backdoor conditional DMs in Section 4.1. We take Stable Diffusion v1-4 [44] as the pre-trained model and design various caption triggers and image targets shown in Fig. 2. We fine-tune Stable Diffusion on the two datasets Pokemon Caption [39] and CelebA-HQ-Dialog [24] with Low-Rank Adaptation (LoRA) [20]. (A hedged sketch of this LoRA fine-tuning setup follows the table.)
Researcher Affiliation | Collaboration | Sheng-Yen Chou, The Chinese University of Hong Kong, shengyenchou@cuhk.edu.hk; Pin-Yu Chen, IBM Research, pin-yu.chen@ibm.com; Tsung-Yi Ho, The Chinese University of Hong Kong, tyho@cse.cuhk.edu.hk
Pseudocode | Yes | Algorithm 1: Backdoor Unconditional DMs with Image Trigger
Open Source Code | Yes | Our code is available on GitHub: https://github.com/IBM/villandiffusion
Open Datasets | Yes | We fine-tune Stable Diffusion on the two datasets Pokemon Caption [39] and CelebA-HQ-Dialog [24] with Low-Rank Adaptation (LoRA) [20]. We also fine-tune the pre-trained diffusion model google/ddpm-cifar10-32 with learning rate 2e-4 and batch size 128 for 100 epochs on the CIFAR10 dataset.
Dataset Splits | No | The dataset is split into 90% training and 10% testing. (No explicit mention of a validation split for reproduction.)
Hardware Specification | Yes | All experiments were conducted on a Tesla V100 GPU with 32 GB memory.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8') are provided.
Experiment Setup | Yes | We fine-tune the pre-trained Stable Diffusion model [44, 45] with the frozen text encoder and set the learning rate to 1e-4 for 50,000 training steps. For the backdoor loss, we set η_p^i = η_c^i = 1 for all i in the loss Eq. (15). We also set the LoRA [20] rank to 4 and the training batch size to 1. We fine-tune the pre-trained diffusion model google/ddpm-cifar10-32 with learning rate 2e-4 and batch size 128 for 100 epochs on the CIFAR10 dataset. To accelerate the training, we use half-precision (float16) training. (Hedged sketches of both reported setups follow this table.)
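
The Research Type and Experiment Setup rows describe the conditional configuration: Stable Diffusion v1-4 fine-tuned with LoRA rank 4, a frozen text encoder, learning rate 1e-4, and batch size 1. The sketch below illustrates what such a setup typically looks like with the diffusers and peft libraries; it is a reconstruction under assumptions, not the authors' released code. The model ID, the targeted attention projections, and all variable names are illustrative, and the backdoor-specific caption-trigger poisoning and the full loss of Eq. (15) are only indicated in comments.

```python
# Minimal sketch of the conditional LoRA fine-tuning setup (assumed diffusers + peft
# recipe; not the paper's actual training script). Backdoor poisoning is omitted.
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline, DDPMScheduler
from peft import LoraConfig, get_peft_model

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
unet, text_encoder, vae, tokenizer = pipe.unet, pipe.text_encoder, pipe.vae, pipe.tokenizer
noise_scheduler = DDPMScheduler.from_pretrained("CompVis/stable-diffusion-v1-4",
                                                subfolder="scheduler")

# Freeze all base weights; only the injected LoRA adapters are trained.
for module in (vae, text_encoder, unet):
    module.requires_grad_(False)

lora_config = LoraConfig(
    r=4,              # LoRA rank 4, as reported in the Experiment Setup row
    lora_alpha=4,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],  # attention projections (assumed)
)
unet = get_peft_model(unet, lora_config)
optimizer = torch.optim.AdamW((p for p in unet.parameters() if p.requires_grad), lr=1e-4)

def training_step(pixel_values, captions):
    """One denoising-loss step; the reported training batch size is 1."""
    tokens = tokenizer(captions, padding="max_length", truncation=True,
                       max_length=tokenizer.model_max_length, return_tensors="pt")
    encoder_hidden_states = text_encoder(tokens.input_ids)[0]

    latents = vae.encode(pixel_values).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy_latents = noise_scheduler.add_noise(latents, noise, t)

    pred = unet(noisy_latents, t, encoder_hidden_states=encoder_hidden_states).sample
    # Clean-data denoising loss only; the paper's backdoor objective (Eq. 15) adds
    # poisoned caption-trigger/image-target terms with weights η_p^i = η_c^i = 1.
    loss = F.mse_loss(pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

To mirror the quoted configuration, such a step would be run for the reported 50,000 steps over caption-image pairs, with a fraction of captions carrying the trigger and paired with the backdoor image target.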
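For the unconditional CIFAR10 setting (google/ddpm-cifar10-32, learning rate 2e-4, batch size 128, 100 epochs, float16), a comparable sketch with diffusers is shown below. Again, this is a generic DDPM fine-tuning loop under assumed APIs, with the image-trigger backdoor term of Algorithm 1 left as a comment; it assumes a CUDA device such as the Tesla V100 noted above.

```python
# Minimal sketch of the unconditional fine-tuning setup (assumed diffusers recipe;
# not the paper's actual script). The image-trigger backdoor loss is omitted.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from diffusers import DDPMPipeline

pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")
unet, scheduler = pipe.unet.cuda(), pipe.scheduler

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),  # scale CIFAR10 images to [-1, 1]
])
loader = DataLoader(datasets.CIFAR10("data", train=True, download=True, transform=transform),
                    batch_size=128, shuffle=True, num_workers=4)

optimizer = torch.optim.AdamW(unet.parameters(), lr=2e-4)
scaler = torch.cuda.amp.GradScaler()  # half-precision (float16) training

for epoch in range(100):
    for x, _ in loader:
        x = x.cuda()
        noise = torch.randn_like(x)
        t = torch.randint(0, scheduler.config.num_train_timesteps, (x.size(0),),
                          device=x.device)
        noisy = scheduler.add_noise(x, noise, t)
        with torch.cuda.amp.autocast():
            pred = unet(noisy, t).sample
            # Clean DDPM loss; Algorithm 1 would add the image-trigger backdoor term here.
            loss = F.mse_loss(pred, noise)
        optimizer.zero_grad()
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```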