Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
GaussMarker: Robust Dual-Domain Watermark for Diffusion Models
Authors: Kecen Li, Zhicong Huang, Xinwen Hou, Cheng Hong
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | GaussMarker efficiently achieves state-of-the-art performance under eight image distortions and four advanced attacks across three versions of Stable Diffusion with better recall and lower false positive rates, as preferred in real applications. ... Thorough experiments show that, on three stable diffusion models and eight image distortions, the average true positive rate and bit accuracy of GaussMarker surpass existing methods, validating the superiority of GaussMarker in watermarking diffusion models. |
| Researcher Affiliation | Collaboration | 1Institute of Automation, Chinese Academy of Sciences 2School of Artificial Intelligence, University of Chinese Academy of Sciences 3Ant Group. Correspondence to: Zhicong Huang <EMAIL>. |
| Pseudocode | No | The paper describes methods using mathematical equations and descriptive text, but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks are present. |
| Open Source Code | No | All these baselines are implemented with their source code. ... 3https://github.com/facebookresearch/stable_signature 4https://github.com/ZhentingWang/LatentTracer. The paper provides links for baseline implementations but does not explicitly state that the source code for GaussMarker is open-source or publicly available. |
| Open Datasets | Yes | The FID is calculated using 5000 prompt and image pairs sampled from MS-COCO (Lin et al., 2014). ... Regeneration Attack. Following a recent benchmark (An et al., 2024), we utilize a diffusion model (Dhariwal & Nichol, 2021) pre-trained on ImageNet to perform regeneration attack. |
| Dataset Splits | Yes | We train the Fuser ... and generate 100 watermarked images and 100 unwatermarked images for training. ... Each baseline generates 1,000 watermarked images and 1,000 un-watermarked images for evaluation. ... The FID is calculated using 5000 prompt and image pairs sampled from MS-COCO (Lin et al., 2014). |
| Hardware Specification | Yes | the training of GNR only needs 72 minutes on 1 V100 32G GPU. |
| Software Dependencies | No | The paper mentions implementing GNR as a UNet and Fuser as a two-layer MLP, and using ChaCha20 for shuffling, but does not provide specific version numbers for software libraries or frameworks like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For DDIM inversion, we use a guidance scale of 0, 50 inversion steps, and an empty prompt. For spatial watermark, we use a random 256-bit watermark and take a secure stream cipher, ChaCha20 (Bernstein et al., 2008), as the Shuffle. We implement GNR as a 30M UNet and train it with a learning rate of 0.0001, batch size of 32, and training steps of 50,000. ... We train the Fuser with a learning rate of 0.001, batch size of 200, and training steps of 1,000. ... For generation, we use the prompts from Stable-Diffusion-Prompts with a guidance scale of 7.5 and 50 sampling steps. |
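For readers attempting to reproduce the setup, the hyperparameters quoted above can be collected into a single configuration. The sketch below is a minimal, hypothetical layout (the key names and the `flatten` helper are illustrative, not from the authors' code); only the numeric values come from the paper's reported settings.

```python
# Hypothetical config grouping the hyperparameters reported in the paper.
# Key names are illustrative; values are taken from the quoted setup.
GAUSSMARKER_CONFIG = {
    "inversion": {              # DDIM inversion
        "guidance_scale": 0.0,
        "num_steps": 50,
        "prompt": "",           # empty prompt
    },
    "watermark": {              # spatial watermark
        "num_bits": 256,
        "shuffle_cipher": "ChaCha20",
    },
    "gnr": {                    # GNR, a ~30M-parameter UNet
        "learning_rate": 1e-4,
        "batch_size": 32,
        "training_steps": 50_000,
    },
    "fuser": {                  # Fuser, a two-layer MLP
        "learning_rate": 1e-3,
        "batch_size": 200,
        "training_steps": 1_000,
    },
    "generation": {             # sampling with Stable-Diffusion-Prompts
        "guidance_scale": 7.5,
        "num_steps": 50,
    },
}

def flatten(cfg: dict, prefix: str = "") -> dict:
    """Flatten the nested config into dotted keys, e.g. for logging."""
    out = {}
    for key, value in cfg.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, dotted + "."))
        else:
            out[dotted] = value
    return out
```

A flat view (`flatten(GAUSSMARKER_CONFIG)["gnr.learning_rate"]` gives `0.0001`) makes it easy to diff a reproduction run against the reported settings.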