Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Training-Free Safe Denoisers for Safe Use of Diffusion Models
Authors: Mingyu Kim, Dongjun Kim, Amman Yusuf, Stefano Ermon, Mi Jung Park
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present the experimental results of our method, Safe Denoiser. Section 5.1 details the outcomes of our text-to-image generation experiments, while the subsequent section explores both class-conditional and unconditional image generation. |
| Researcher Affiliation | Academia | Mingyu Kim 1 Dongjun Kim 2 Amman Yusuf 1 Stefano Ermon 2 Mijung Park 1 1 CS, UBC 2 CS, Stanford EMAIL, EMAIL EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Training-Free Safe Denoiser |
| Open Source Code | No | For reproducibility, we plan to release our code upon acceptance. ... After the acceptance, we plan to release the code to the public. |
| Open Datasets | Yes | We achieve state-of-the-art safety in large-scale datasets such as the CoPro dataset while enabling significantly more cost-effective sampling than existing methodologies. ... To evaluate safety, we follow previous studies by assessing Attack Success Rate (ASR), Toxic Rate (TR), and Inappropriate Probability (IP) [2, 1]. ... For the nudity task, we select 515 unsafe images from I2P [2] ... We use Stable Diffusion (SD) [3] v1.4 ... We use a PyTorch package [45] to compute the FID by comparing 10K reference images selected from the COCO2014 [46] validation split and 10K generated images from the prompts identically selected from the same COCO dataset. ... FFHQ [49] ... ImageNet [48] ... CelebA-HQ [59] ... The I2P dataset was obtained from https://huggingface.co/datasets/AIML-TUDA/i2p ... The CoPro dataset was obtained from https://github.com/rt219/LatentGuard/blob/main/dataset/CoPro_v1.0.json ... The curated Ring-A-Bell dataset was obtained from either https://github.com/CharlesGong12/RECE or https://github.com/jaehong31/SAFREE. ... The dataset was obtained from https://github.com/CharlesGong12/RECE or https://github.com/jaehong31/SAFREE. |
| Dataset Splits | Yes | We use a PyTorch package [45] to compute the FID by comparing 10K reference images selected from the COCO2014 [46] validation split and 10K generated images from the prompts identically selected from the same COCO dataset. ... In this experiment, we select 1K female images from CelebA-HQ [59] validation split to serve as unseen negative data ... Each experiment generates 50 samples per class across all 1000 ImageNet classes, producing 50,000 samples that are then evaluated with a pretrained ImageNet classifier for precision, recall, and classification accuracy measurements [60]. |
| Hardware Specification | Yes | Table 5 presents the wall-clock time for image generation on NVIDIA RTX4090 with 24GB memory. |
| Software Dependencies | No | We use a PyTorch package [45] to compute the FID by comparing 10K reference images selected from the COCO2014 [46] validation split and 10K generated images from the prompts identically selected from the same COCO dataset. |
| Experiment Setup | Yes | For a fair comparison, we maintain the same number of inference steps, specifically 50, aligning with the official implementations of both SLD and SAFREE, which also use 50 inference steps. ... The RBF kernel function is defined as follows: $K(x, x') = \exp\left(-\lVert x - x' \rVert^2 / (2\sigma^2)\right)$. For the bandwidth parameter σ, we set a value of 1.0 for SLD and 3.15 for SAFREE. Additionally, in case of SAFREE, we apply a scaling factor η = 0.33, whereas for SLD, we use η = 0.03 ... we propose to apply the safe denoiser only at the beginning of sampling process. ... we utilize the pretrained diffusion models from FFHQ [57] and ImageNet [7]. For the experiments, we use a DDPM solver [58] with 100 steps. ... In this experiment, we chose σ = 1.0 and η = 0.05, and employ Safe Denoiser across the entire denoising timesteps. ... We condition on class labels by scaling the classifier guidance at 5.0, creating a strong pull towards the desired class during the sampling process. |
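
The RBF kernel and the bandwidth and scaling values quoted in the Experiment Setup row can be illustrated with a small sketch. This is not the authors' code: the names `rbf_kernel` and `REPORTED_SETTINGS` are hypothetical, and the `1 / (2 * sigma^2)` normalization is an assumption (the common Gaussian RBF form) that should be checked against the paper before reuse.

```python
import math
from typing import Sequence

# Bandwidth (sigma) and scaling factor (eta) as quoted in the Experiment Setup row
# for the text-to-image baselines. The dictionary layout is hypothetical.
REPORTED_SETTINGS = {
    "SLD": {"sigma": 1.0, "eta": 0.03},
    "SAFREE": {"sigma": 3.15, "eta": 0.33},
}


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 * sigma^2)).

    Assumption: the 1 / (2 * sigma^2) normalization is the common convention
    and may differ from the exact form used in the paper.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance between the two vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


if __name__ == "__main__":
    a, b = [0.0, 0.0], [1.0, 1.0]
    for name, cfg in REPORTED_SETTINGS.items():
        print(name, rbf_kernel(a, b, cfg["sigma"]))
```

Under this form, a larger σ makes the kernel decay more slowly with distance, so the SAFREE setting (σ = 3.15) weights distant points more heavily than the SLD setting (σ = 1.0).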