Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Transferable Black-Box One-Shot Forging of Watermarks via Image Preference Models

Authors: Tomáš Souček, Sylvestre-Alvise Rebuffi, Pierre Fernandez, Nikola Jovanović, Hady Elsahar, Valeriu Lacatusu, Tuan Tran, Alexandre Mourachko

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we present the experimental setup and results of our watermark forging approach. First, in Section 4.1, we describe the key implementation and evaluation details. Then, in Section 4.2, we compare our approach of watermark forging and removal to related methods. Finally, in Section 4.3, we ablate the key design choices.
Researcher Affiliation Collaboration (1) Meta FAIR, (2) ETH Zurich
Pseudocode No The paper describes the methodology in prose and mathematical equations (e.g., Equation (1), (2), (3), (4)) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Our code and further resources are publicly available (footnote 3: https://github.com/facebookresearch/videoseal/tree/main/wmforger).
Open Datasets Yes We train our model, ConvNeXt V2-Tiny [43] on images from the SA-1b dataset [20].
Dataset Splits Yes We watermark 100 images from the SA-1b validation set by all tested watermarking methods: CIN [29], MBRS [17], TrustMark [4], and VideoSeal [12]. For each watermarking method, we watermark all 100 images using the same hidden message... For watermark forging, we steal the watermarks from the same 100 test images and apply the stolen watermarks to a new set of 100 images.
Hardware Specification Yes The model is trained from scratch for 120k steps on 8 GPUs with a batch size of 16 per GPU; the training takes 60 hours using V100 GPUs. [...] The SGD runtime for a single image using a Quadro GP100 GPU is six seconds for k = 50 steps using vanilla PyTorch without JIT optimization.
Software Dependencies No The paper mentions using ConvNeXt V2-Tiny as the model architecture and optimizers like AdamW and SGD, but does not provide specific version numbers for any software frameworks or libraries like PyTorch, TensorFlow, or Python itself.
Experiment Setup Yes We resize each image to the resolution of 768 × 768 and apply a random synthetic artifact to it. Then, both the image with and without artifact are augmented by the same random image augmentation followed by the same random crop of size 256 × 256. The model is trained from scratch for 120k steps on 8 GPUs with a batch size of 16 per GPU; the training takes 60 hours using V100 GPUs. We use AdamW optimizer with a fixed learning rate of 1 × 10⁻⁵. In every second batch, we replace the image with the synthetic artifact by its adversarially perturbed version as described in Section 3.1. To compute the perturbation, we use two steps of gradient descent with a learning rate randomly chosen from the interval [0.03, 0.09].
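
The data-side details quoted in the Experiment Setup and Hardware Specification rows can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: images are stand-in 2-D lists rather than tensors, a horizontal flip stands in for the unspecified "random image augmentation", the choice of odd steps for the adversarial batches is arbitrary, and the effective batch size assumes plain data parallelism (GPUs × per-GPU batch). All function names are hypothetical. The adversarial objective itself is defined in Section 3.1 of the paper and is not reproduced here.

```python
import random

# Values quoted in the Experiment Setup / Hardware Specification rows.
RESIZE = 768                 # images are first resized to 768 x 768
CROP = 256                   # then cropped to 256 x 256
STEPS = 120_000
GPUS = 8
BATCH_PER_GPU = 16
TRAIN_HOURS = 60
ADV_LR_RANGE = (0.03, 0.09)  # interval for the perturbation learning rate
ADV_STEPS = 2                # gradient steps used to compute the perturbation


def paired_random_crop(clean, degraded, crop, rng):
    """Apply the SAME random flip and crop window to both images.

    `clean` and `degraded` are 2-D lists of identical shape (hypothetical
    stand-ins for image tensors). The flip is an assumed example augmentation.
    """
    h, w = len(clean), len(clean[0])
    top = rng.randint(0, h - crop)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop)
    flip = rng.random() < 0.5         # drawn once, shared by both images

    def apply(img):
        window = [row[left:left + crop] for row in img[top:top + crop]]
        return [row[::-1] for row in window] if flip else window

    return apply(clean), apply(degraded)


def use_adversarial(step):
    """True on every second batch (assumption: the odd-numbered steps)."""
    return step % 2 == 1


def sample_adv_lr(rng):
    """Draw the perturbation learning rate uniformly from the quoted interval."""
    return rng.uniform(*ADV_LR_RANGE)


def training_budget():
    """Derive effective batch size, total samples, and samples per second."""
    effective_batch = GPUS * BATCH_PER_GPU          # assumes data parallelism
    samples = STEPS * effective_batch
    per_second = samples / (TRAIN_HOURS * 3600)
    return effective_batch, samples, per_second
```

Under these assumptions, `training_budget()` gives an effective batch of 128, about 15.4 million training samples over the run, and roughly 71 samples per second across the 8 V100 GPUs. Sharing one set of random draws between the two images in `paired_random_crop` is what keeps the image with the artifact and the image without it spatially aligned after augmentation.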