Localizing and Editing Knowledge In Text-to-Image Generative Models

Authors: Samyadeep Basu, Nanxuan Zhao, Vlad I. Morariu, Soheil Feizi, Varun Manjunatha

ICLR 2024

Reproducibility assessment (variable, result, and LLM response):
Research Type: Experimental. "In this work, we empirically study this question towards understanding how knowledge corresponding to different visual attributes is stored in text-to-image models, using Stable Diffusion (Rombach et al., 2021) as a representative model. In particular, we adapt Causal Mediation Analysis (Vig et al., 2020; Pearl, 2001) for large-scale text-to-image diffusion models to identify specific causal components in (i) the UNet and (ii) the text-encoder where visual attribute knowledge resides."
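For context on what that causal analysis involves, below is a minimal activation-patching sketch of Causal Mediation Analysis on a toy PyTorch network: a clean run caches a layer's activation, a corrupted run measures the damage, and a restored run patches the cached activation back in to estimate that layer's indirect effect. The model, layer choice, and recovery metric are illustrative stand-ins, not the paper's Stable Diffusion pipeline.

```python
# Minimal sketch of Causal Mediation Analysis via activation patching.
# Toy network only; the paper applies this three-run pattern to layers of
# Stable Diffusion's UNet and text-encoder. All names are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),  # candidate causal layer lives here
    nn.Linear(16, 4),
)
layer = model[2]

clean_x = torch.randn(1, 8)
corrupt_x = clean_x + 0.5 * torch.randn(1, 8)  # e.g., noised embeddings

# 1) Clean run: cache the activation at the layer under study.
cache = {}
handle = layer.register_forward_hook(lambda m, i, o: cache.update(act=o.detach()))
clean_out = model(clean_x)
handle.remove()

# 2) Corrupted run: baseline damaged output.
corrupt_out = model(corrupt_x)

# 3) Restored run: corrupted input, but restore part of the clean activation.
def patch_hook(module, inputs, output):
    patched = output.clone()
    patched[:, :8] = cache["act"][:, :8]  # restore only a subset of units
    return patched                        # returned value replaces the output

handle = layer.register_forward_hook(patch_hook)
restored_out = model(corrupt_x)
handle.remove()

# Indirect effect: fraction of the damage repaired by restoring this layer.
recovered = 1 - (clean_out - restored_out).norm() / (clean_out - corrupt_out).norm()
print(f"fraction of damage recovered: {recovered.item():.2f}")
```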
Researcher Affiliation: Collaboration. Samyadeep Basu (1), Nanxuan Zhao (2), Vlad Morariu (2), Soheil Feizi* (1), Varun Manjunatha* (2); 1: University of Maryland, 2: Adobe Research.
Pseudocode: No. The paper describes the mathematical formulation of the DIFF-QUICKFIX method with Equations (8) and (9) but does not provide a formal pseudocode block or algorithm box.
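While the paper gives no algorithm box, the role that Equations (8) and (9) play can be read as a ridge-regularized least squares over a projection matrix with a closed-form solution. The sketch below shows that generic update under assumed shapes; W_old, K, V, and lam are hypothetical names, and the paper's exact objective may differ.

```python
# Hedged sketch of a closed-form weight edit in the spirit of DIFF-QUICKFIX:
# fit a projection W so that concept "key" embeddings K map to target "value"
# embeddings V, while regularizing toward the original weights W_old.
# Shapes, names, and the lambda value are illustrative assumptions.
import torch

torch.manual_seed(0)
d_out, d_in, n = 64, 64, 32
W_old = torch.randn(d_out, d_in) / d_in ** 0.5  # original projection matrix
K = torch.randn(d_in, n)                        # embeddings of prompts to edit
V = torch.randn(d_out, n)                       # desired outputs for those prompts
lam = 10.0                                      # strength of the pull toward W_old

# Minimize ||W K - V||_F^2 + lam * ||W - W_old||_F^2. Setting the gradient
# to zero gives the closed-form solution below, so no fine-tuning is needed.
W_new = (V @ K.T + lam * W_old) @ torch.linalg.inv(K @ K.T + lam * torch.eye(d_in))

# Sanity check: the edited weights should map K closer to V.
print("fit before edit:", (W_old @ K - V).norm().item())
print("fit after edit: ", (W_new @ K - V).norm().item())
```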
Open Source Code: Yes. Code is available at https://github.com/samyadeepbasu/DiffQuickFix.
Open Datasets: Yes. "For removing concepts such as artistic styles or objects using DIFF-QUICKFIX, we use the prompt dataset from (Kumari et al., 2023)."
Dataset Splits: No. The paper mentions selecting 'a small validation set of 10 prompts per attribute' to determine the optimal CLIP-Score threshold, but it does not give split information (percentages or sample counts) for the datasets used in the experiments, nor does it reference predefined standard splits.
Hardware Specification: No. The paper does not specify the hardware used for its experiments, such as GPU/CPU models, processor types, or memory amounts, and does not mention cloud or cluster resources.
Software Dependencies: No. The paper does not list ancillary software dependencies, such as library names with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup: Yes. "We validate DIFF-QUICKFIX by applying edits to a Stable-Diffusion (Rombach et al., 2021) model and quantifying the efficacy of the edit. For removing concepts such as artistic styles or objects using DIFF-QUICKFIX, we use the prompt dataset from (Kumari et al., 2023). For updating knowledge (e.g., President of a country) in text-to-image models, we add newer prompts to the prompt dataset from (Kumari et al., 2023) and provide further details in Appendix N. We compare our method with (i) Original Stable-Diffusion; (ii) Editing methods from (Kumari et al., 2023) and (Gandikota et al., 2023). To validate the effectiveness of editing methods including our DIFF-QUICKFIX, we perform evaluation using automated metrics such as CLIP-Score."
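For reference, one common way to compute a CLIP-Score is the cosine similarity between CLIP's image and text embeddings, sketched below with Hugging Face's transformers. The checkpoint name, image path, and prompt are illustrative assumptions; the paper does not specify its exact CLIP variant or scaling convention in the quoted passage.

```python
# Hedged sketch of CLIP-Score evaluation: cosine similarity between a
# generated image and its prompt. Checkpoint, file name, and prompt are
# placeholders, not the paper's exact configuration.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated_sample.png")      # sample from the edited model
prompt = "a painting in the style of Van Gogh"  # concept targeted by the edit

inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                      attention_mask=inputs["attention_mask"])

# A successful concept-removal edit should lower this score for edited
# concepts while leaving it unchanged for unrelated prompts.
score = torch.nn.functional.cosine_similarity(img_emb, txt_emb).item()
print(f"CLIP-Score: {score:.3f}")
```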