Ring-A-Bell! How Reliable are Concept Removal Methods For Diffusion Models?
Authors: Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie, Chih-Hsun Lin, Jia-You Chen, Bo Li, Pin-Yu Chen, Chia-Mu Yu, Chun-Ying Huang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments evaluate a wide range of models, ranging from popular online services to state-of-the-art concept removal methods, and reveal that problematic prompts generated by Ring-A-Bell can increase the success rate for most concept removal methods in generating inappropriate images by more than 30%. |
| Researcher Affiliation | Collaboration | Chia-Yi Hsu, Yu-Lin Tsai (National Yang Ming Chiao Tung University; {chiayihsu8315,uriah1001}@gmail.com); Chulin Xie (University of Illinois at Urbana Champaign; chulinx2@illinois.edu); Chih-Hsun Lin, Jia-You Chen (National Yang Ming Chiao Tung University; {pkevawin334, justin041510}@gmail.com); Bo Li (University of Illinois at Urbana Champaign, University of Chicago; lbo@illinois.edu, bol@uchicago.edu); Pin-Yu Chen (IBM Research; pin-yu.chen@ibm.com); Chia-Mu Yu, Chun-Ying Huang (National Yang Ming Chiao Tung University; chiamuyu@gmail.com, chuang@cs.nctu.edu.tw) |
| Pseudocode | No | The paper describes its methods but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our codes are available at https://github.com/chiayi-hsu/Ring-A-Bell. |
| Open Datasets | Yes | Dataset. We evaluate the performance of Ring-A-Bell on the I2P dataset (Schramowski et al., 2023), an established dataset of problematic prompts, on the concepts of nudity and violence. |
| Dataset Splits | No | The paper describes selecting subsets of prompts from the I2P dataset for evaluation but does not specify train, validation, or test splits for its own experimental methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions using CLIP with ViT-L/14 but does not specify version numbers for other key software dependencies or libraries. |
| Experiment Setup | Yes | To run the GA, we use 200 random initial prompts with 3000 generations and set the mutation rate and crossover rate to 0.25 and 0.5, respectively. Furthermore, there are hyper-parameters: K (the length of the prompts), η (the weight of the empirical concept), and N (the number of prompt pairs). In Section 4.3, we will show how K, η, N as well as the choice of optimizer affect the attack results. |
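
The genetic-algorithm settings quoted in the Experiment Setup row can be made concrete with a minimal sketch. It is not the authors' implementation: the function `run_ga`, the integer-token prompt representation, the elitism and truncation selection, and the toy fitness function are all assumptions made for illustration. The mutation and crossover rates are treated as per-child probabilities, which the quoted text does not specify. Only the default values (200 initial prompts, 3000 generations, rates 0.25 and 0.5) and the prompt length K come from the quoted setup.

```python
import random


def run_ga(fitness, vocab_size, K, pop_size=200, generations=3000,
           mutation_rate=0.25, crossover_rate=0.5, seed=0):
    """Toy GA over length-K token-ID sequences (illustrative, not the paper's code)."""
    rng = random.Random(seed)
    # Random initial prompts, each a list of K token IDs.
    pop = [[rng.randrange(vocab_size) for _ in range(K)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]  # assumed truncation selection
        children = [ranked[0][:]]                  # assumed elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = a[:]
            # Single-point crossover, applied with probability crossover_rate.
            if K > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, K)
                child = a[:cut] + b[cut:]
            # Replace one random token, applied with probability mutation_rate.
            if rng.random() < mutation_rate:
                child[rng.randrange(K)] = rng.randrange(vocab_size)
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Hypothetical stand-in objective: negative Hamming distance to a fixed target.
TOY_TARGET = [3, 1, 4, 1, 5]


def toy_fitness(prompt):
    return -sum(x != y for x, y in zip(prompt, TOY_TARGET))


if __name__ == "__main__":
    # Small settings so the demo runs quickly; the quoted setup uses 200 and 3000.
    best = run_ga(toy_fitness, vocab_size=10, K=5, pop_size=20, generations=50)
    print(best, toy_fitness(best))
```

In an actual attack the fitness would score a candidate prompt against the target concept, which is where η and N would enter. That objective is not described in this summary, so it is left out of the sketch.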