Direct Unlearning Optimization for Robust and Safe Text-to-Image Models

Authors: Yong-Hyun Park, Sangdoo Yun, Jin-Hwa Kim, Junho Kim, Geonhui Jang, Yonghyun Jeong, Junghyo Jo, Gayoung Lee

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that DUO can robustly defend against various state-of-the-art red-teaming methods without significant performance degradation on unrelated topics, as measured by FID and CLIP scores.
Researcher Affiliation | Collaboration | 1. Department of Physics Education, Seoul National University; 2. School of Industrial and Management Engineering, Korea University; 3. NAVER AI Lab; 4. NAVER Cloud; 5. Korea Institute for Advanced Study (KIAS); 6. AI Institute of Seoul National University (SNU AIIS)
Pseudocode | No | The paper does not include a pseudocode or algorithm block.
Open Source Code | No | The paper only promises a future release ("We will publicly open the source-code for reproducib[ility]"); no repository link is provided.
Open Datasets | Yes | To evaluate model performance unrelated to the unlearned concept, we measure the FID [19] and CLIP scores [18] using the MS-COCO 30k validation dataset [30].
Dataset Splits | Yes | The same passage names the split used: evaluation is run on the validation split of MS-COCO (30k images) for both FID and CLIP scores.
Hardware Specification | No | The paper does not provide specific hardware details, such as the GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions Stable Diffusion v1.4, LoRA, and the Adam optimizer, but does not give version numbers for key software dependencies such as deep-learning frameworks (e.g., PyTorch, TensorFlow) or other libraries beyond the "Python 3.8" noted in the checklist.
Experiment Setup | Yes | We use Stable Diffusion v1.4 (SD1.4) with a LoRA [23, 48] rank of 32 and the Adam optimizer for fine-tuning. For generating the unsafe images x, we use "naked" as the prompt with a guidance scale of 7.5. When we use SDEdit, the magnitude of the added noise is t = 0.75T, where T is the maximum number of diffusion timesteps, and the guidance scale is 7.5. Using β = 100 as a baseline, we use a learning rate of 3 × 10⁻⁴ and a batch size of 4.