Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Emergence and Evolution of Interpretable Concepts in Diffusion Models

Authors: Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform extensive experiments on the features of a popular, large-scale text-to-image DM, Stable Diffusion v1.4 [40], and extract thousands of concepts via SAEs. We perform extensive experiments on SD v1.4 aimed at understanding how internal representations emerge and evolve through the generative process.
Researcher Affiliation | Academia | Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi; Dept. of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
Pseudocode | No | The paper describes the SAE architecture, loss functions, and intervention techniques using mathematical equations and descriptive text, but it does not present any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/berktinaz/stable-concepts. We release the code base and accompanying model checkpoints. The data we use is already open-source.
Open Datasets | Yes | We use 200k training prompts from the LAION-COCO dataset [44] and store $\Delta_{\ell,t} \in \mathbb{R}^{H_\ell \times W_\ell \times d_\ell}$, the difference between the output and input of the $\ell$th cross-attention transformer block at diffusion time $t$ (i.e. the update to the residual stream). We use a filtered version of the LAION-COCO dataset (footnote 3) for training prompts. Footnote 3: https://huggingface.co/datasets/guangyil/laion-coco-aesthetic, license: apache-2.0
Dataset Splits | Yes | We sample 40k prompts from the LAION-COCO dataset from a split that has not been used to train the SAEs. We sample 5k LAION-COCO test prompts that have not been used for SAE training or to build the concept dictionary, and generate corresponding images with SDv1.4.
Hardware Specification | Yes | We train all models with Adam optimizer on a single NVIDIA RTX A6000 GPU.
Software Dependencies | No | The paper mentions specific models and optimizers like 'Stable Diffusion v1.4' and 'Adam optimizer', but it does not specify programming languages, libraries, or other software dependencies with version numbers.
Experiment Setup | Yes | Training hyperparameters are as follows: $\alpha = 1/32$, batch_size: 4096, $d = 1280$, learning_rate: 0.0001, $k_{\text{aux}} = 256$, n_epochs: 1, $n_f = 4d = 5120$. We use a guidance scale of $\omega = 7.5$ and 50 DDIM steps to collect the activations.
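For readers who want to reuse the reported settings, the following is a minimal sketch that collects the values from the Experiment Setup row into plain Python dataclasses. The class and field names (SAETrainingConfig, SamplingConfig, expansion_factor, and so on) are hypothetical and are not taken from the released code base; only the numeric values come from the row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SAETrainingConfig:
    """Reported SAE training hyperparameters (field names are illustrative)."""

    alpha: float = 1 / 32         # alpha as reported; its role is not stated in this summary
    batch_size: int = 4096
    d_model: int = 1280           # d, the activation dimension
    learning_rate: float = 1e-4
    k_aux: int = 256
    n_epochs: int = 1
    expansion_factor: int = 4     # assumption: encodes the reported relation n_f = 4d

    @property
    def n_features(self) -> int:
        # Dictionary size n_f derived from d rather than stored separately.
        return self.expansion_factor * self.d_model


@dataclass(frozen=True)
class SamplingConfig:
    """Reported settings used when collecting activations."""

    guidance_scale: float = 7.5   # omega
    ddim_steps: int = 50


train_cfg = SAETrainingConfig()
sample_cfg = SamplingConfig()

# Consistency check against the reported n_f = 4d = 5120.
assert train_cfg.n_features == 5120
```

Deriving n_features from d_model keeps the two values from drifting apart if the activation dimension is changed for a different layer or model.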