Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation

Authors: Anh Bui, Tung-Long Vuong, Khanh Doan, Trung Le, Paul Montague, Tamas Abraham, Dinh Phung

NeurIPS 2024

Reproducibility assessment. Each entry below lists the reproducibility variable, the assessed result, and the supporting LLM response excerpted from the paper.
Research Type: Experimental
LLM Response: "We demonstrate the effectiveness of our method using the Stable Diffusion model, showing that it outperforms state-of-the-art erasure methods in eliminating unwanted content while maintaining the integrity of other unrelated elements. Our code is available at https://github.com/tuananhbui89/Erasing-Adversarial-Preservation." and "In this section, we present a series of experiments to evaluate the effectiveness of our method in erasing various types of concepts from the foundation model."
Researcher Affiliation: Collaboration
LLM Response: "Anh Bui¹, Long Vuong¹, Khanh Doan², Trung Le¹, Paul Montague³, Tamas Abraham³, Dinh Phung¹ (¹Monash University, ²VinAI Research, ³Defence Science and Technology Group, Australia)"
Pseudocode: Yes
LLM Response: "The pseudo-algorithm involves a two-step optimization process, outlined in Algorithm 1: Finding Adversarial Concept and Algorithm 2: Adversarial Erasure Training."
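As one plausible reading of the two algorithms, the loop alternates between searching for the concept most disturbed by the current erasure and fine-tuning with a preservation term on that concept. The sketch below is a minimal PyTorch illustration, not the authors' released implementation: `unet`, `frozen_unet`, the concept-embedding tensors, and the sign-ascent inner step are all assumptions, and the negative-guidance target follows the ESD-style erasure literature rather than the paper's exact formulation.

```python
import torch

# Minimal sketch of the two-step loop behind Algorithms 1 and 2, assuming a
# PyTorch/Stable Diffusion setup. All names here are illustrative stand-ins.

def find_adversarial_concept(unet, frozen_unet, c_adv, x_t, t, n_iter=2, eta=1e-3):
    """Algorithm 1 (sketch): ascend on a concept embedding to find the concept
    whose noise prediction drifts most under the partially erased model."""
    c_adv = c_adv.clone().requires_grad_(True)
    for _ in range(n_iter):
        drift = (unet(x_t, t, c_adv) - frozen_unet(x_t, t, c_adv)).pow(2).mean()
        (grad,) = torch.autograd.grad(drift, c_adv)
        c_adv = (c_adv + eta * grad.sign()).detach().requires_grad_(True)
    return c_adv.detach()

def adversarial_erasure_step(unet, frozen_unet, c_erase, c_adv, c_null,
                             x_t, t, optimizer, lam=1.0):
    """Algorithm 2 (sketch): push the erased concept toward a negative-guidance
    target while pinning the adversarial concept to the frozen model."""
    with torch.no_grad():
        eps_null = frozen_unet(x_t, t, c_null)      # unconditional prediction
        eps_erase = frozen_unet(x_t, t, c_erase)    # concept to remove
        eps_keep = frozen_unet(x_t, t, c_adv)       # concept to preserve
        target = eps_null - (eps_erase - eps_null)  # steer away from the concept
    loss = (unet(x_t, t, c_erase) - target).pow(2).mean() \
        + lam * (unet(x_t, t, c_adv) - eps_keep).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Each training iteration would call `find_adversarial_concept` (with the paper's defaults of 2 inner steps and step size 10⁻³) before `adversarial_erasure_step` with λ = 1.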
Open Source Code: Yes
LLM Response: "Our code is available at https://github.com/tuananhbui89/Erasing-Adversarial-Preservation." and "Our code is anonymously published at https://anonymous.4open.science/r/Adversarial-Erasing/."
Open Datasets: Yes
LLM Response: "We choose Imagenette, a subset of the ImageNet dataset (Deng et al., 2009) which comprises 10 easily recognizable classes, including 'Cassette Player', 'Chain Saw', 'Church', 'Gas Pump', 'Tench', 'Garbage Truck', 'English Springer', 'Golf Ball', 'Parachute', and 'French Horn'." (footnote: https://github.com/fastai/imagenette) and "Additionally, to measure the preserving performance, we generate images with COCO 30K prompts and measure the FID score compared to COCO 30K validation images."
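For context, producing per-class evaluation images from Stable Diffusion 1.4 could look like the sketch below, using the Hugging Face diffusers pipeline. The prompt template and file naming are assumptions; the paper only lists the class names.

```python
import torch
from diffusers import StableDiffusionPipeline

# Generate one sample per Imagenette class with SD 1.4 (the paper's base model).
# The "a photo of a ..." prompt template is an assumption for illustration.
CLASSES = ["Cassette Player", "Chain Saw", "Church", "Gas Pump", "Tench",
           "Garbage Truck", "English Springer", "Golf Ball", "Parachute", "French Horn"]

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

for name in CLASSES:
    image = pipe(f"a photo of a {name}", num_inference_steps=50).images[0]
    image.save(f"{name.replace(' ', '_')}.png")
```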
Dataset Splits: Yes
LLM Response: "Additionally, to measure the preserving performance, we generate images with COCO 30K prompts and measure the FID score compared to COCO 30K validation images."
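The preservation metric itself is standard FID; a minimal sketch using torchmetrics is shown below, with random uint8 tensors standing in for the generated and validation image sets.

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# FID between model generations (from COCO 30K prompts) and COCO 30K validation
# images. Random uint8 batches of shape (N, 3, H, W) stand in for both sets here.
fid = FrechetInceptionDistance(feature=2048)
real = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)  # validation set
fake = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)  # generations
fid.update(real, real=True)
fid.update(fake, real=False)
print(f"FID: {fid.compute().item():.2f}")
```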
Hardware Specification: Yes
LLM Response: "Our models are trained on one NVIDIA A100 GPU of 80GB."
Software Dependencies: No
LLM Response: The paper mentions the Adam optimizer and Stable Diffusion (SD) version 1.4 but does not specify version numbers for other key software components or libraries (e.g., Python, PyTorch, CUDA).
Experiment Setup: Yes
LLM Response: "We maintain consistent settings across all methods: finetuning the model for 1000 steps with a batch size of 1, using the Adam optimizer with a learning rate of α = 10⁻⁵." and "For searching hyperparameters, we use N_iter = 2, η = 1×10⁻³, and a trade-off λ = 1 as the default settings."
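Gathered in one place, the reported settings could be wired up as in the sketch below; the variable names are illustrative, and only the values come from the paper.

```python
import torch

# Reported fine-tuning configuration; only the values are from the paper.
config = dict(
    train_steps=1000,  # fine-tuning steps
    batch_size=1,
    lr=1e-5,           # Adam learning rate (alpha)
    n_iter=2,          # N_iter: inner steps of the adversarial-concept search
    eta=1e-3,          # step size of the inner search
    lam=1.0,           # lambda: erasing/preservation trade-off
)

model = torch.nn.Linear(4, 4)  # placeholder for the fine-tuned UNet
optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
```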