Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models
Authors: Simone Carnemolla, Matteo Pennisi, Sarinda Samarasinghe, Giovanni Bellitto, Simone Palazzo, Daniela Giordano, Mubarak Shah, Concetto Spampinato
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Quantitative and qualitative evaluations, including a user study, show that DEXTER produces accurate, interpretable outputs. Experiments on ImageNet, Waterbirds, CelebA, and FairFaces confirm that DEXTER outperforms existing approaches in global model explanation and class-level bias reporting. |
| Researcher Affiliation | Academia | ¹University of Catania, ²University of Central Florida |
| Pseudocode | Yes | B DEXTER Algorithm; Algorithm 1: DEXTER |
| Open Source Code | Yes | Code is available at https://github.com/perceivelab/dexter. |
| Open Datasets | Yes | Experiments on ImageNet, Waterbirds, CelebA, and FairFaces confirm that DEXTER outperforms existing approaches in global model explanation and class-level bias reporting. |
| Dataset Splits | No | The paper uses standard datasets such as ImageNet, Waterbirds, CelebA, and FairFaces, and references external training schemes like the debiased training scheme from [35], but it does not explicitly state the training, validation, or test split percentages or sample counts for the experiments conducted in this paper. |
| Hardware Specification | Yes | All experiments ran in half-precision on three H100 GPUs. |
| Software Dependencies | Yes | We adopt CLIP as the text encoder and Stable Diffusion v1.4 as the diffusion model. To reduce inference time, we employ the Latent Consistency Model (LCM) LoRA adapter using 4 inference steps. Hugging Face Stable Diffusion id: compvis/stable-diffusion-v1-4. Hugging Face LoRA id: latent-consistency/lcm-lora-sdv1-5. |
| Experiment Setup | Yes | DEXTER is trained with a batch size of 1 (i.e., one image per iteration) and a learning rate of 0.1 across all tasks. ... for multi-word optimization, the fixed prompt is a picture of a [MASK] with [MASK] and [MASK] and [MASK] and [MASK] and [MASK]. We set the sequence P of soft prompts p to 1... The temperature τ for the Gumbel softmax is kept at its default value of 1.0. ... We used 1000 DEXTER optimization steps and single word prompting... For bias reasoning, we generate 50 images... over the course of up to 5,000 optimization steps... temperature of 0.2... max tokens parameter is set to 0... top_p is fixed at 1.0, while both the frequency_penalty and the presence_penalty are set to 0.0. |
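
For readers who want the reported settings in one place, the sketch below collects a subset of the values quoted in the Software Dependencies and Experiment Setup rows into a plain Python configuration object, and rebuilds the fixed multi-word prompt. It is only an illustration: the class name `DexterSetup`, every field name, and the helper `multi_word_prompt` are hypothetical and do not come from the paper or its repository. The `llm_` prefix reflects an assumption that the temperature, top_p, and penalty values describe the language model used for bias reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DexterSetup:
    """Values quoted in the table above; all field names are hypothetical."""

    batch_size: int = 1                 # one image per iteration
    learning_rate: float = 0.1          # reported as shared across all tasks
    gumbel_temperature: float = 1.0     # Gumbel softmax default
    lcm_inference_steps: int = 4        # LCM LoRA adapter steps
    sd_model_id: str = "compvis/stable-diffusion-v1-4"
    lcm_lora_id: str = "latent-consistency/lcm-lora-sdv1-5"
    bias_images: int = 50               # images generated for bias reasoning
    max_bias_steps: int = 5000          # "up to 5,000 optimization steps"
    # Assumed to be language-model sampling settings (see lead-in).
    llm_temperature: float = 0.2
    llm_top_p: float = 1.0
    llm_frequency_penalty: float = 0.0
    llm_presence_penalty: float = 0.0


def multi_word_prompt(num_masks: int = 6) -> str:
    """Rebuild the fixed prompt: one [MASK], 'with', then the rest joined by 'and'."""
    if num_masks < 2:
        raise ValueError("the multi-word prompt needs at least two [MASK] slots")
    rest = " and ".join(["[MASK]"] * (num_masks - 1))
    return f"a picture of a [MASK] with {rest}"


if __name__ == "__main__":
    print(DexterSetup())
    print(multi_word_prompt())
```

With the default of six slots, `multi_word_prompt()` yields the same string as the prompt quoted in the Experiment Setup row. Settings whose context is elided in the row are deliberately left out.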