Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Unveiling Concept Attribution in Diffusion Models
Authors: Nguyen Hung-Quang, Hoang Phan, Khoa D Doan
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results validate the significance of both positive and negative components pinpointed by our framework, demonstrating the potential of providing a complete view of interpreting generative models. Our code is available here. ...We analyze CAD and evaluate the effectiveness of the proposed editing algorithms with extensive experiments, demonstrating their practicality and effectiveness. |
| Researcher Affiliation | Academia | Quang H. Nguyen¹, Hoang Phan¹,², Khoa D. Doan¹,²; ¹College of Engineering and Computer Science, VinUniversity; ²VinUni-Illinois Smart Health Center, VinUniversity; EMAIL |
| Pseudocode | Yes | Algorithm 1: CAD-Erase. Input: diffusion model $\Phi$, target concept $c$, base condition $c_b$, the number of components $k$. Output: diffusion model $\Phi$ with a lower chance to generate concept $c$. Steps: (1) Generate a set of images $x$ conditioned on $c$. (2) Compute the score of each component $w_i$ with Eq. (4). (3) Locate the top-$k$ components $w_i \in S$ with the highest (positive) attribution. (4) Set $w_i = 0$ for all $w_i \in S$. Algorithm 2: CAD-Amplify. Input: diffusion model $\Phi$, target concept $c$, the number of components $k$, images $x$ of concept $c$. Output: diffusion model $\Phi$ with a higher chance to generate concept $c$. Steps: (1) Compute the score of each component $w_i$ with Eq. (5). (2) Locate the top-$k$ components $w_i \in S$ with the lowest (negative) attribution. (3) Set $w_i = 0$ for all $w_i \in S$. |
| Open Source Code | Yes | Extensive experimental results validate the significance of both positive and negative components pinpointed by our framework, demonstrating the potential of providing a complete view of interpreting generative models. Our code is available here. |
| Open Datasets | Yes | We select 10 classes from Imagenette: "cassette player", "chain saw", "church", "English springer", "French horn", "garbage truck", "gas pump", "golf ball", "parachute", and "tench". We validate the performance on unrelated knowledge by generating images with 30,000 prompts in the COCO dataset [23]. |
| Dataset Splits | Yes | We select 10 classes from Imagenette: "cassette player", "chain saw", "church", "English springer", "French horn", "garbage truck", "gas pump", "golf ball", "parachute", and "tench". For each class, we compute component attributions and ablate 0.1% of components using Algorithm 1. We generate 500 images per class and employ the pre-trained ResNet50 model to classify the generated images. We validate the performance on unrelated knowledge by generating images with 30,000 prompts in the COCO dataset [23]. |
| Hardware Specification | Yes | We conduct the experiments on RTX A5000 GPUs. |
| Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers. |
| Experiment Setup | Yes | For each class, we compute component attributions and ablate 0.1% of components using Algorithm 1. We locate and ablate the top 0.075% positive components with the prompt "naked". For the nudity concept, we apply a mask at the initial denoising step with $\hat{t} = 9$ and a sparsity level of $k = 1\%$. For object removal in the Imagenette classes, we use $\hat{t} = 10$ and $k = 2\%$. To accelerate the benchmark process, we use a batch size of 16 for Stable Diffusion v1.4 and 8 for Stable Diffusion v2.1. This allows us to evaluate using a single A5000 GPU. We maintain a consistent seed of 0 for all benchmark experiments. |
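
The selection-and-ablation step shared by Algorithms 1 and 2 in the Pseudocode row can be illustrated with a minimal Python sketch. It assumes the per-component attribution scores have already been computed (Eq. (4) and Eq. (5) are not reproduced in this summary, so they are not implemented here) and treats the model as a flat list of component weights. The function names `select_components` and `ablate`, the `fraction` parameter, and the toy numbers are hypothetical and do not come from the authors' code.

```python
import math
from typing import List, Sequence


def select_components(scores: Sequence[float], fraction: float,
                      positive: bool) -> List[int]:
    """Pick the indices of the top `fraction` of components by attribution.

    positive=True  -> highest positive scores (the CAD-Erase setting).
    positive=False -> lowest negative scores (the CAD-Amplify setting).
    Assumption: scores are precomputed; this is not Eq. (4) or Eq. (5).
    """
    # Assumption: always select at least one component.
    k = max(1, math.ceil(fraction * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=positive)
    chosen = order[:k]
    # Drop candidates whose sign does not match the requested direction.
    if positive:
        return [i for i in chosen if scores[i] > 0]
    return [i for i in chosen if scores[i] < 0]


def ablate(weights: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Return a copy of `weights` with the selected components set to zero."""
    out = list(weights)
    for i in indices:
        out[i] = 0.0
    return out


# Toy usage with made-up numbers (10 components, 20% sparsity).
weights = [0.5, -1.2, 0.3, 2.0, -0.7, 0.9, 1.1, -0.4, 0.6, 0.2]
scores = [0.1, -0.8, 0.05, 0.9, -0.3, 0.0, 0.4, -0.1, 0.2, -0.05]

erased = ablate(weights, select_components(scores, 0.2, positive=True))
amplified = ablate(weights, select_components(scores, 0.2, positive=False))
```

In both cases the selected components are zeroed, as in the quoted pseudocode; only the sign of the attribution used for selection differs. The toy fraction of 0.2 is chosen for readability, whereas the quoted setup reports much sparser settings such as 0.1% of components.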