Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Emergence and Evolution of Interpretable Concepts in Diffusion Models
Authors: Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on the features of a popular, large-scale text-to-image DM, Stable Diffusion v1.4 [40], and extract thousands of concepts via SAEs. We perform extensive experiments on SD v1.4 aimed at understanding how internal representations emerge and evolve through the generative process. |
| Researcher Affiliation | Academia | Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi; Dept. of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA |
| Pseudocode | No | The paper describes the SAE architecture, loss functions, and intervention techniques using mathematical equations and descriptive text, but it does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/berktinaz/stable-concepts. We release the code base and accompanying model checkpoints. The data we use is already open-source. |
| Open Datasets | Yes | We use 200k training prompts from the LAION-COCO dataset [44] and store Δ_{ℓ,t} ∈ ℝ^{H_ℓ × W_ℓ × d_ℓ}, the difference between the output and input of the ℓth cross-attention transformer block at diffusion time t (i.e. the update to the residual stream). We use a filtered version of the LAION-COCO dataset for training prompts (https://huggingface.co/datasets/guangyil/laion-coco-aesthetic, license: apache-2.0). |
| Dataset Splits | Yes | We sample 40k prompts from the LAION-COCO dataset from a split that has not been used to train the SAEs. We sample 5k LAION-COCO test prompts that have not been used for SAE training or to build the concept dictionary, and generate corresponding images with SDv1.4. |
| Hardware Specification | Yes | We train all models with Adam optimizer on a single NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions specific models and optimizers like 'Stable Diffusion v1.4' and 'Adam optimizer', but it does not specify programming languages, libraries, or other software dependencies with version numbers. |
| Experiment Setup | Yes | Training hyperparameters are as follows: α = 1/32, batch_size: 4096, d = 1280, learning_rate: 0.0001, k_aux = 256, n_epochs: 1, n_f = 4d = 5120. We use guidance scale of ω = 7.5 and 50 DDIM steps to collect the activations. |
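
For readers who want to reuse the reported settings, the Experiment Setup row can be collected into a small configuration object. The sketch below is illustrative only: the class name `SAEConfig` and its field names are hypothetical and are not taken from the authors' code base, and the role of α is not stated in this summary, so it is stored as a plain coefficient. Only the numeric values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SAEConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""

    d: int = 1280                 # activation dimension d
    expansion: int = 4            # ratio n_f / d, since n_f = 4d
    alpha: float = 1 / 32         # coefficient alpha as reported (role not stated here)
    k_aux: int = 256              # k_aux as reported
    batch_size: int = 4096
    learning_rate: float = 1e-4   # used with the Adam optimizer
    n_epochs: int = 1
    guidance_scale: float = 7.5   # omega, used when collecting activations
    ddim_steps: int = 50          # DDIM steps used when collecting activations

    @property
    def n_features(self) -> int:
        """Number of SAE features n_f, derived from d and the expansion factor."""
        return self.expansion * self.d


cfg = SAEConfig()
print(cfg.n_features)
```

Deriving `n_features` from `d` and the expansion factor, rather than storing 5120 directly, keeps the relation n_f = 4d explicit if `d` is changed for a different layer.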