Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models
Authors: Lexiang Xiong, Liu Chengyu, Jingwen Ye, YAN LIU, Yuecong Xu
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on object, explicit content, artistic style, and multi-celebrity erasure tasks, demonstrating that our method significantly outperforms state-of-the-art approaches. That is, our proposed concept erasure framework achieves superior completeness and robustness while preserving locality and general image quality (e.g., achieving a 93.58 H-score in object erasure, reducing explicit content to just 1 instance with a 12.2 FID, and attaining an 8.09 Ha in style erasure with no MS-COCO FID/CLIP degradation). |
| Researcher Affiliation | Academia | 1National University of Singapore 2Sichuan University |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations (e.g., Eq. 1-18) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/Lexiang-Xiong/Semantic-Surgery. |
| Open Datasets | Yes | We evaluate Semantic Surgery on five diverse erasure challenges. The first four are standard benchmarks: Object Erasure (CIFAR-10 classes [22]), Explicit Content Removal (I2P dataset [43]), and both Artistic Style and Multi-Concept Celebrity Erasure [27]. To specifically address a critical aspect of security, our fifth evaluation focuses on Robustness Against Adversarial Attacks, where we test our method's resilience against both black-box (RAB [50]) and white-box (UnlearnDiffAtk [59]) adversarial prompts. |
| Dataset Splits | Yes | We evaluate object erasure on the 10 categories of the CIFAR-10 dataset [22]. For each category, a model is configured to erase that specific target concept. We measure Efficacy (Acc E), Robustness (Acc R), and Locality (Acc L). Acc E is the percentage of successful erasures for simple prompts (e.g., "A photo of {class}"). Acc R measures erasure success for paraphrased prompts (e.g., "A sleek jetliner soaring through clear skies" for "airplane"), generated via ChatGPT as per Receler [19]. Acc L is the generation accuracy for the nine non-target classes using their paraphrased prompts. ... To assess the erasure of nuanced attributes, we focus on removing artistic styles. ... Following the methodology of MACE [27], we utilize the Image Synthesis Style Studies Database [20] to curate a set of 200 distinct artists. This set is divided into an "erasure group" of 100 artists, whose styles are targeted for removal, and a "retention group" of 100 artists, whose styles should remain generatable. ... We adopt MACE's setup [27] for celebrity erasure: a dataset of 200 celebrities (100 "erasure group", 100 "retention group"). We erase 1, 5, 10, and all 100 celebrities from the erasure group. |
| Hardware Specification | Yes | Table 7: Inference time analysis per image (50 steps) on a single NVIDIA RTX4090 GPU. |
| Software Dependencies | No | The paper mentions using "Stable Diffusion v1.4 [42]" and the "DDIM sampler (50 steps)" as well as specific detectors like "OWL-ViT detector [31]" and "NudeNet [6]". However, it does not provide specific version numbers for any software libraries or dependencies, which is required for a 'Yes' answer. |
| Experiment Setup | Yes | Our method's key hyperparameters are set as γ = 0.02 and τ = 0.5, with task-specific thresholds β. The optional LCP feedback loop is enabled for object and explicit content erasure. ... All experiments use Stable Diffusion v1.4 [42] with the DDIM sampler (50 steps). Table 5: Summary of key hyperparameter settings and LCP visual feedback configuration across experiments. Columns: Experiment Task; β (Decision Threshold); γ (Steepness); τ (Activation Threshold); Visual Feedback (LCP); λvis (if LCP active). CIFAR-10 Object Erasure: -0.12; 0.02 (Global); 0.5 (Global); Yes (AOD [32]); 1.0. I2P Explicit Content Removal: -0.06; 0.02 (Global); 0.5 (Global); Yes (NudeNet [6]); 1.0. Artistic Style Erasure: -0.30; 0.02 (Global); 0.5 (Global); No; N/A. Multi-Concept Celebrity Erasure: -0.28; 0.02 (Global); 0.5 (Global); No; N/A. Adversarial Robustness: -0.06; 0.02 (Global); 0.5 (Global); No; N/A. |
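
The Table 5 settings quoted in the Experiment Setup row can be collected into a small lookup for anyone scripting a reproduction. The following is a minimal sketch, not code from the paper's repository: the class name `ErasureConfig`, the dictionary `TABLE5`, the task keys, and the helper `lcp_enabled` are hypothetical names chosen for illustration. Only the numeric values and detector names come from the quoted table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErasureConfig:
    """Hypothetical container for one row of Table 5."""
    beta: float                          # task-specific decision threshold
    gamma: float = 0.02                  # steepness (global)
    tau: float = 0.5                     # activation threshold (global)
    lcp_detector: Optional[str] = None   # visual feedback detector, if LCP is active
    lambda_vis: Optional[float] = None   # only defined when LCP is active


# Values transcribed from Table 5 as quoted above; the keys are illustrative.
TABLE5 = {
    "cifar10_object_erasure": ErasureConfig(beta=-0.12, lcp_detector="AOD", lambda_vis=1.0),
    "i2p_explicit_content": ErasureConfig(beta=-0.06, lcp_detector="NudeNet", lambda_vis=1.0),
    "artistic_style_erasure": ErasureConfig(beta=-0.30),
    "multi_concept_celebrity": ErasureConfig(beta=-0.28),
    "adversarial_robustness": ErasureConfig(beta=-0.06),
}


def lcp_enabled(task: str) -> bool:
    """Return True when the task uses the optional LCP visual feedback loop."""
    return TABLE5[task].lcp_detector is not None
```

Keeping γ and τ as defaults mirrors the table's "(Global)" annotation, so only β and the LCP fields vary per task.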