Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
Authors: Die Chen, Zhiwen Li, Mingyuan Fan, Cen Chen, Wenmeng Zhou, Yanhao Wang, Yaliang Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experimentation, we demonstrate that our approach achieves superior erasure results with little effect on other concepts while preserving image quality and semantics. |
| Researcher Affiliation | Collaboration | ¹School of Data Science and Engineering, East China Normal University; ²Alibaba Group |
| Pseudocode | Yes | Algorithm 1: Growth Inhibitors for Erasure (GIE). Input: a prompt P and a target concept P\* to be erased. Output: an image x_safe where the concept P\* has been erased. Encode the prompt as c = Encoder(P) and the target concept as c\* = Encoder(P\*); draw a sample z_T from the Gaussian distribution N(0, I); let [s+1 : e] be the interval where the tokens of the target concept are located; w ← Adapter(z_T, c, t = T); for t = T, T−1, ..., 1 do: M ← DM(z_t, c, t); M\* ← DM(z_t, c\*, t); I ← Extract(M\*, w, s+1, e−1); M_replace ← Inject(M, I); c_replace ← Inject(c, c\*[s+1:e]); z_{t−1} ← DM(z_t, c_replace, t){M ← M_replace}; end for; return x_safe ← z_0. |
| Open Source Code | Yes | Our code and data are publicly available at https://github.com/CD22104/Growth-Inhibitors-for-Erasure. |
| Open Datasets | Yes | In the NSFW content erasure task, we use the inappropriate image prompts (I2P) dataset (Schramowski et al., 2023) to examine the generation results for both implicit and explicit unsafe prompts. ... We also evaluate whether the semantics and quality of the generated images remain unaffected after concept erasure using the COCO-30K prompt dataset (Lin et al., 2014), which consists of 30,000 natural language descriptions of daily scenes. |
| Dataset Splits | No | The paper mentions training an adapter using a limited number of samples ('a few dozen images', '60 prompts') but does not provide specific train/test/validation splits or percentages for these. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions using pre-trained models and tools like CLIP, Nude Net, and GPT-4o, but does not provide specific software dependencies with version numbers for their implementation. |
| Experiment Setup | Yes | The training process uses the mean squared error as the loss function, Adam as the optimizer with a learning rate lr = 0.001, and sets the training epochs at 2,000. |
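The control flow of the Algorithm 1 pseudocode quoted above can be sketched as a plain-Python loop. Everything below (the `dm_step`, `extract`, and `inject_attention` helpers, the array shapes, the dummy latent update) is a hypothetical stand-in for the paper's diffusion-model components, not the authors' implementation; the prompt-embedding replacement (c_replace) is also omitted for brevity.

```python
import numpy as np

def dm_step(z, cond, t, rng):
    """Hypothetical stand-in for one diffusion-model call: returns a
    cross-attention map (latent positions x prompt tokens) and is keyed
    by the latent z, the conditioning cond, and the timestep t."""
    attn = rng.random((z.shape[0], cond.shape[0]))
    return attn

def extract(attn_star, w, s, e):
    """Pull out the target-concept columns [s+1 : e-1] of the attention
    map, scaled by the adapter weight w (the 'growth inhibitor')."""
    return w * attn_star[:, s + 1:e]

def inject_attention(attn, patch, s, e):
    """Overwrite the corresponding columns of the original attention map."""
    out = attn.copy()
    out[:, s + 1:e] = patch
    return out

def gie_loop(c, c_star, s, e, T=10, w=0.5, seed=0):
    """Toy version of the GIE denoising loop from Algorithm 1."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(16)                    # z_T ~ N(0, I)
    for t in range(T, 0, -1):
        attn = dm_step(z, c, t, rng)               # M  <- DM(z_t, c, t)
        attn_star = dm_step(z, c_star, t, rng)     # M* <- DM(z_t, c*, t)
        patch = extract(attn_star, w, s, e)        # I  <- Extract(M*, w, s+1, e-1)
        attn_rep = inject_attention(attn, patch, s, e)  # M_replace
        # In the real model, DM would rerun the step with M_replace injected
        # into its cross-attention; here a dummy update shrinks the latent.
        z = (1.0 - 0.01 * attn_rep.mean()) * z     # z_{t-1}
    return z                                       # x_safe <- z_0

x_safe = gie_loop(c=np.zeros((8, 4)), c_star=np.zeros((8, 4)), s=2, e=5)
```

The sketch only mirrors the data flow: two attention maps per step, a scaled extract-and-inject of the target-concept columns, then the next latent.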
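The reported training setup (MSE loss, Adam, lr = 0.001, 2,000 epochs) can be illustrated with a minimal from-scratch loop. The linear "adapter" here is a hypothetical placeholder for the paper's adapter network; only the loss, optimizer, learning rate, and epoch count come from the paper.

```python
import numpy as np

def train_adapter(X, Y, lr=1e-3, epochs=2000, seed=0):
    """Fit a linear map W with MSE loss and a hand-rolled Adam optimizer,
    mirroring the reported hyperparameters (lr = 0.001, 2,000 epochs)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], Y.shape[1])) * 0.1
    m = np.zeros_like(W)                      # first-moment estimate
    v = np.zeros_like(W)                      # second-moment estimate
    b1, b2, eps = 0.9, 0.999, 1e-8            # standard Adam constants
    for t in range(1, epochs + 1):
        pred = X @ W
        grad = 2 * X.T @ (pred - Y) / len(X)  # gradient of the MSE loss
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad ** 2
        m_hat = m / (1 - b1 ** t)             # bias-corrected moments
        v_hat = v / (1 - b2 ** t)
        W -= lr * m_hat / (np.sqrt(v_hat) + eps)
    mse = float(np.mean((X @ W - Y) ** 2))
    return W, mse
```

On a synthetic linear target the loop drives the MSE close to zero within the 2,000 epochs, which is the behavior the reported setup implies for a small adapter.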