DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

Authors: Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas, Shiqing Ma

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on Stable Diffusion and VQ Diffusion with different model training or fine-tuning methods (i.e., LoRA, DreamBooth, and standard training) demonstrate the effectiveness of our proposed method in detecting unauthorized data usages.
Researcher Affiliation | Collaboration | 1 Rutgers University, 2 Sony AI, 3 University of Massachusetts Amherst
Pseudocode | Yes | Algorithm 1 (Data Coating) and Algorithm 2 (Unauthorized Data Usages Detection) are provided as pseudocode blocks.
Open Source Code | Yes | Code: https://github.com/ZhentingWang/DIAGNOSIS.
Open Datasets | Yes | Four datasets (i.e., Pokemon, CelebA (Liu et al., 2015), CUB-200 (Wah et al., 2011), and Dog (Ruiz et al., 2023)) are used.
Dataset Splits | Yes | Note that we split part of the samples (10% by default) in D as the validation set for developing the signal classifier.
Hardware Specification | Yes | We conduct all experiments on an Ubuntu 20.04 server equipped with six Quadro RTX 6000 GPUs.
Software Dependencies | Yes | Our method is implemented with Python 3.9 and PyTorch 2.0.1.
Experiment Setup | Yes | By default, the coating rates used for unconditional injected memorization and trigger-conditioned injected memorization are 100.0% and 20.0%, respectively. We use 50 text prompts to approximate the memorization strength (i.e., N = 50) by default. The default warping strengths are 2.0 and 1.0 for unconditional and trigger-conditioned injected memorization, respectively. The default hyper-parameters for the hypothesis testing are discussed in Section 3.3.
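The dataset-splits row above reports that 10% of the coated dataset D is held out as a validation set for developing the signal classifier. Below is a minimal sketch of such a split, assuming an in-memory list of samples; the function and variable names are illustrative and not taken from the authors' released code.

```python
import random

def split_for_signal_classifier(dataset, val_fraction=0.1, seed=0):
    """Hold out a fraction (10% by default) of the coated dataset D
    as a validation set for developing the signal classifier."""
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(len(indices) * val_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return [dataset[i] for i in train_idx], [dataset[i] for i in val_idx]

# Toy usage: a list of 1000 items standing in for D.
train_split, val_split = split_for_signal_classifier(list(range(1000)))
print(len(train_split), len(val_split))  # 900 100
```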
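The experiment-setup row lists the default coating rates, warping strengths, and the number of text prompts (N = 50) used to approximate the memorization strength. The sketch below collects those reported defaults and shows one plausible reading of the approximation, namely the fraction of generated images that the signal classifier flags as carrying the injected signal; the exact procedure and the hypothesis-testing thresholds are defined in Section 3.3 of the paper, and every identifier here is an assumption.

```python
# Default hyper-parameters reported in the experiment setup (values from the paper;
# the dictionary itself is an illustrative structure, not the authors' config format).
DEFAULTS = {
    "coating_rate_unconditional": 1.0,        # 100.0%
    "coating_rate_trigger_conditioned": 0.2,  # 20.0%
    "warping_strength_unconditional": 2.0,
    "warping_strength_trigger_conditioned": 1.0,
    "num_prompts": 50,                        # N = 50
}

def approximate_memorization_strength(generate, signal_classifier, prompts):
    """Hedged sketch: estimate memorization strength as the fraction of images,
    generated from the given text prompts, that the signal classifier flags as
    containing the injected (coated) signal."""
    flags = [bool(signal_classifier(generate(p))) for p in prompts]
    return sum(flags) / len(flags)
```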