reproducibilityindex.ai

Diffusion PID: Interpreting Diffusion via Partial Information Decomposition

Authors: Shaurya Dewan, Rushikesh Zawar, Prakanshul Saxena, Yingshan Chang, Andrew Luo, Yonatan Bisk

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our work presents Diffusion Partial Information Decomposition (Diffusion PID), a novel technique that applies information-theoretic principles to decompose the input text prompt into its elementary components, enabling a detailed examination of how individual tokens and their interactions shape the generated image. ... Our results show that PID is a potent tool for evaluating and diagnosing text-to-image diffusion models.
Researcher Affiliation	Academia	Shaurya Dewan* Rushikesh Zawar* Prakanshul Saxena* Yingshan Chang Andrew Luo Yonatan Bisk Carnegie Mellon University (srdewan, rzawar, prakanss, yingshac, afluo, ybisk)@andrew.cmu.edu
Pseudocode	No	The paper uses mathematical equations and describes procedures in prose, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Link to project page: https://rbz-99.github.io/Diffusion-PID/. ... We have shared our code and data in the supplementary. We will be releasing the code publicly before the main conference.
Open Datasets	Yes	We primarily rely on existing hierarchical datasets, Wordnet [77] and COCO [76]... Stable Diffusion 2.1 [7].
Dataset Splits	No	The paper uses a pre-trained model and generates images for analysis. It does not describe any specific training, validation, or test splits for its own experimental methodology beyond generating data.
Hardware Specification	Yes	A single A6000 GPU was used to generate the PID maps for each data sample.
Software Dependencies	No	The paper mentions using 'BERT [61]' and 'Stable Diffusion 2.1 model from Hugging Face' but does not provide specific version numbers for these or other software dependencies like Python or deep learning frameworks.
Experiment Setup	Yes	For our experiments, we primarily focus on the pre-trained Stable Diffusion 2.1 model from Hugging Face. ... Thus, all our PID computations occur in this latent space... During visualization, the heatmaps are bilinearly interpolated from this latent space to the original image resolution. Finally, we make use of 50 samples for evaluating the integral over SNRs using importance sampling. A single A6000 GPU was used to generate the PID maps for each data sample.
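
The Experiment Setup row names two numerical steps without detail: estimating an integral over SNRs from 50 importance samples, and bilinearly interpolating latent-resolution heatmaps up to image resolution. The following is a minimal sketch of both in plain Python, not the authors' code. The quoted text states only the sample count and the interpolation type; the logistic proposal over log-SNR, its `loc` and `scale` parameters, the pixel-centre sampling convention, and all function names are assumptions made for illustration.

```python
import math
import random


def logistic_pdf(a, loc=0.0, scale=1.0):
    """Density of a logistic distribution, written to avoid overflow."""
    e = math.exp(-abs((a - loc) / scale))
    return e / (scale * (1.0 + e) ** 2)


def importance_sample_integral(f, n=50, loc=0.0, scale=1.0, seed=0):
    """Estimate the integral of f over the real line from n logistic draws.

    Assumption: the integration variable is log-SNR and the proposal is
    logistic; the source only says 50 importance samples are used.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # Inverse-CDF draw; u is clamped away from 0 and 1 so log stays finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        a = loc + scale * math.log(u / (1.0 - u))
        # Weight each integrand value by 1 / proposal density.
        total += f(a) / logistic_pdf(a, loc, scale)
    return total / n


def bilinear_upsample(grid, out_h, out_w):
    """Resize a 2D list of floats with bilinear interpolation.

    Assumption: output pixel centres are mapped back to input coordinates
    and clamped at the border.
    """
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        y = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1.0 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1.0 - wx) + grid[y1][x1] * wx
            row.append(top * (1.0 - wy) + bottom * wy)
        out.append(row)
    return out
```

With only 50 draws the estimate is noisy for a generic integrand, so the choice of proposal matters: the closer its shape is to the integrand, the lower the variance. In practice a tensor library's built-in resize would replace `bilinear_upsample`; the loop version is shown only to make the arithmetic explicit.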