Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Authors: Sheng-Yu Wang, Aaron Hertzmann, Alexei Efros, Jun-Yan Zhu, Richard Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method with a computationally intensive but gold-standard retraining from scratch and demonstrate our method's advantages over previous methods. Our experiments show that our algorithm outperforms prior work on both benchmarks, demonstrating that unlearning synthesized images is an effective way to attribute training images. |
| Researcher Affiliation | Collaboration | ¹Carnegie Mellon University, ²Adobe Research, ³UC Berkeley |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at: https://peterwang512.github.io/AttributeByUnlearning. |
| Open Datasets | Yes | We use MSCOCO [25] (~100k images), which allows for retraining models within a reasonable compute budget. |
| Dataset Splits | No | The paper mentions using the MSCOCO 2017 training split and text prompts from the MSCOCO validation set for evaluation, but it does not specify explicit training/validation/test dataset splits with percentages or sample counts for its own model training. |
| Hardware Specification | Yes | We conduct all of our experiments on A100 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for general software dependencies (e.g., Python, PyTorch, CUDA) used in its experimental setup; it only mentions specific models such as "Stable Diffusion v2" or "ViT-B/32". |
| Experiment Setup | Yes | To retrain each MSCOCO model for leave-K-out evaluation, we follow the same training recipe as the source model, where each model is trained for 200 epochs with a learning rate of 10^-4 and a batch size of 128. To unlearn a synthesized sample in MSCOCO models, we find that running with 1 step already yields good attribution performance. We perform Newton unlearning updates with step sizes of 0.01 and update only the cross-attention KV weights (W_k, W_v). (A configuration sketch follows the table.) |
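
The Experiment Setup row pins down which weights are touched (cross-attention K/V projections) and the unlearning step size (0.01, one step). Below is a minimal sketch of how that restriction could be expressed, assuming the Hugging Face `diffusers` parameter naming convention (`attn2.to_k` / `attn2.to_v`) and substituting a plain gradient step for the paper's Newton update; the model id and the `unlearning_step` helper are illustrative, not the authors' code.

```python
# Hedged sketch: restrict updates to the cross-attention K/V projections of a
# Stable Diffusion UNet and apply a simplified unlearning step. The paper's
# second-order (Newton) update is replaced by a first-order gradient ascent
# step purely for illustration.
import torch
from diffusers import UNet2DConditionModel

# Illustrative checkpoint; the paper's MSCOCO models are trained from scratch.
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2-base", subfolder="unet"
)

# Freeze everything, then re-enable only the cross-attention key/value
# projections (W_k, W_v); in diffusers these live under "attn2.to_k"/"attn2.to_v".
for p in unet.parameters():
    p.requires_grad_(False)
kv_params = [
    p for name, p in unet.named_parameters()
    if "attn2.to_k" in name or "attn2.to_v" in name
]
for p in kv_params:
    p.requires_grad_(True)

step_size = 0.01        # unlearning step size reported in the table
num_unlearn_steps = 1   # a single step suffices for the MSCOCO models

def unlearning_step(loss_on_synthesized_sample: torch.Tensor) -> None:
    """One simplified unlearning update on the K/V weights.

    The paper preconditions this step with second-order information (a Newton
    update); here we only sketch the ascent direction on the diffusion loss of
    the synthesized sample being unlearned.
    """
    grads = torch.autograd.grad(loss_on_synthesized_sample, kv_params)
    with torch.no_grad():
        for p, g in zip(kv_params, grads):
            p.add_(step_size * g)  # ascend the loss to "forget" the sample
```

In practice the loss passed to `unlearning_step` would be the standard diffusion denoising loss evaluated on the synthesized image and its prompt; attribution scores are then derived from how much each training image's loss changes after the update.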