Boosting Alignment for Post-Unlearning Text-to-Image Generative Models

Authors: Myeongseob Ko, Henry Li, Zhun Wang, Jonathan Patsenker, Jiachen (Tianhao) Wang, Qinbin Li, Ming Jin, Dawn Song, Ruoxi Jia

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our evaluation demonstrates that our method effectively removes target classes from recent diffusion-based generative models and concepts from stable diffusion models while maintaining close alignment with the models' original trained states, thus outperforming state-of-the-art baselines."
Researcher Affiliation | Academia | Myeongseob Ko (Virginia Tech, myeongseob@vt.edu); Henry Li (Yale University, henry.li@yale.edu); Zhun Wang (University of California, Berkeley, zhun.wang@berkeley.edu); Jonathan Patsenker (Yale University, jonathan.patsenker@yale.edu); Jiachen T. Wang (Princeton University, tianhaowang@princeton.edu); Qinbin Li (University of California, Berkeley, liqinbin1998@gmail.com); Ming Jin (Virginia Tech, jinming@vt.edu); Dawn Song (University of California, Berkeley, dawnsong@berkeley.edu); Ruoxi Jia (Virginia Tech, ruoxijia@vt.edu)
Pseudocode | No | The paper does not contain any pseudocode blocks or clearly labeled algorithm sections.
Open Source Code | Yes | "Our code will be made available at https://github.com/reds-lab/Restricted_gradient_diversity_unlearning.git."
Open Datasets | Yes | "For our CIFAR-10 experiments, we leverage the EDM framework [Karras et al., 2022]... For dataset construction, we used all samples in each class for the CIFAR-10 forgetting dataset and 800 samples for Stable Diffusion experiments."
Dataset Splits | Yes | "We evaluate model performance on both training prompts (Dr,train) used during unlearning and a separate set of held-out test prompts (Dr,test). These two distinct sets are constructed by carefully splitting semantic dimensions (e.g., activities, environments, moods). Detailed construction procedures for both sets are provided in Appendix D."
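One way such a dimension-level split might look in practice (an illustrative sketch only; the paper's actual dimension lists and construction procedure are in its Appendix D, and every name and value below is hypothetical): hold out a disjoint subset of each semantic dimension's values, then form prompts as the cross product, so no held-out prompt shares all of its semantic components with a training prompt.

```python
import itertools

# Hypothetical semantic dimensions; the paper's actual lists are in Appendix D.
dimensions = {
    "activity": ["reading", "running", "painting", "cooking"],
    "environment": ["in a park", "in a kitchen", "on a beach", "in a library"],
    "mood": ["cheerful", "calm", "dramatic", "gloomy"],
}

def split_dimension(values, test_fraction=0.25):
    """Reserve a disjoint subset of each dimension's values for held-out prompts."""
    n_test = max(1, int(len(values) * test_fraction))
    return values[n_test:], values[:n_test]

train_dims, test_dims = {}, {}
for name, values in dimensions.items():
    train_dims[name], test_dims[name] = split_dimension(values)

def build_prompts(dims, subject="a person"):
    """Cross the per-dimension values into full text prompts."""
    combos = itertools.product(dims["activity"], dims["environment"], dims["mood"])
    return [f"{subject} {a} {e}, {m} mood" for a, e, m in combos]

retain_train = build_prompts(train_dims)  # plays the role of D_{r,train}
retain_test = build_prompts(test_dims)    # plays the role of D_{r,test}
assert not set(retain_train) & set(retain_test)  # splits share no prompt
```

Because the split happens at the level of dimension values rather than whole prompts, the held-out set probes generalization to unseen semantic combinations, not just unseen strings.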
Hardware Specification | Yes | "All experiments were conducted using an NVIDIA H100 GPU."
Software Dependencies | No | The paper mentions using the EDM framework and pre-trained Stable Diffusion version 1.4, but it does not specify concrete version numbers for ancillary software such as Python, PyTorch, TensorFlow, or CUDA libraries.
Experiment Setup | Yes | "Both implementations require two key hyperparameters: the weight λ of the gradient descent direction relative to the ascent direction, and the loss truncation value α... Detailed hyperparameter configurations are provided in Appendix C. ... For experiments on CIFAR-10, we implemented our method using hyperparameters α = 1 × 10⁻¹ and λ = 5. Our EDM implementation used a batch size of 64, a duration parameter of 0.05, and a learning rate of 1e-5."
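As a minimal sketch of how these two hyperparameters might enter a combined objective (this is not the paper's implementation; it assumes a clamp-style truncation of the forget loss and a standard ascend-on-forget / descend-on-retain formulation, with all names illustrative):

```python
import torch

def unlearning_loss(forget_loss, retain_loss, lam=5.0, alpha=0.1):
    """Illustrative combined objective: minimizing this ascends on the forget
    loss, truncated at alpha so the ascent term cannot grow without bound,
    while descending on the retain loss weighted by lam."""
    truncated_forget = torch.clamp(forget_loss, max=alpha)
    return -truncated_forget + lam * retain_loss

# Toy scalar losses standing in for per-batch diffusion denoising losses.
loss = unlearning_loss(torch.tensor(0.5), torch.tensor(0.2))  # -> 0.9
```

Under this reading, λ = 5 weights retention five times more heavily than forgetting, and α = 1 × 10⁻¹ caps how much the forget term can dominate a single update.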