Certified Neural Network Watermarks with Randomized Smoothing

Authors: Arpit Bansal, Ping-Yeh Chiang, Michael J Curry, Rajiv Jain, Curtis Wigington, Varun Manjunatha, John P Dickerson, Tom Goldstein

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our first set of experiments, we investigate the strength of our certificate under two datasets and three watermark schemes. In our second set of experiments, we evaluate the watermark's empirical robustness to removal compared to previous methods that claimed resistance to removal attacks.
Researcher Affiliation | Collaboration | University of Maryland, College Park; Adobe Research, USA.
Pseudocode | Yes | Algorithm 1: Embed Certifiable Watermark; Algorithm 2: Evaluate and Certify the Median Smoothed Model (both sketched below).
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the method, nor does it provide a link to a code repository.
Open Datasets | Yes | For example, MNIST images could form the trigger set for a CIFAR-10 network. To train the watermarked model, we used ResNet-18. We used the Watermark-Robustness-Toolbox to conduct the additional persistence evaluation. (Trigger-set construction is sketched below.)
Dataset Splits | No | No explicit train/validation/test splits were provided beyond 'Only 50% of the data is used for training, since we reserve the other half for the adversary.'
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. It mentions 'TPU cost' in reference to GPT-3, but not for the authors' own experimental setup.
Software Dependencies | No | The paper does not provide software dependency details with version numbers. It mentions using ResNet-18 and the SGD and Adam optimizers, but not the frameworks or library versions used for implementation.
Experiment Setup | Yes | To train the watermarked model, we used ResNet-18 and SGD with a learning rate of 0.05, momentum of 0.9, and weight decay of 1e-4. The model is trained for 100 epochs, and the learning rate is divided by 10 every 30 epochs. ... For our watermark models, we select σ of 1, a replay count of 20, and a noise sample count of 100. ... To attack the model, we used Adam with learning rates of 0.1, 0.001, or 0.0001 for 50 epochs. (See the configuration sketch below.)
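
Since the code is not released, the following is a minimal PyTorch-style sketch of what "Algorithm 1: Embed Certifiable Watermark" plausibly looks like, assuming the watermark is embedded by replaying the trigger set under Gaussian noise on the weights (the randomized-smoothing idea in the title). The function name, the loop structure, and the reuse of the σ = 1 and replay count = 20 values from the Experiment Setup row are assumptions, not the authors' exact procedure.

```python
import torch

def embed_certifiable_watermark(model, train_loader, trigger_loader,
                                optimizer, sigma=1.0, replay_count=20):
    """Hypothetical sketch of Algorithm 1 (not the authors' exact code)."""
    loss_fn = torch.nn.CrossEntropyLoss()

    # Ordinary pass over the primary task data.
    for x, y in train_loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()

    # Trigger-set pass: replay each trigger batch several times, each time
    # computing the gradient at a Gaussian-perturbed copy of the weights,
    # so the watermark survives smoothing-scale weight perturbations.
    for x, y in trigger_loader:
        for _ in range(replay_count):
            noise = []
            with torch.no_grad():
                for p in model.parameters():
                    n = sigma * torch.randn_like(p)
                    p.add_(n)                  # move to the noisy point
                    noise.append(n)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()    # gradient at the noisy point
            with torch.no_grad():
                for p, n in zip(model.parameters(), noise):
                    p.sub_(n)                  # restore the clean weights
            optimizer.step()                   # apply the noisy-point gradient
```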
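Similarly, a hedged sketch of "Algorithm 2: Evaluate and Certify the Median Smoothed Model", assuming the certificate comes from median smoothing over Gaussian weight perturbations: measure trigger accuracy at many noisy copies of the weights and report an empirical quantile. The noise sample count of 100 comes from the Experiment Setup row; the statistical lower-bound step on the median (e.g. a binomial confidence bound) is omitted here.

```python
import torch

@torch.no_grad()
def median_smoothed_trigger_accuracy(model, trigger_loader,
                                     sigma=1.0, noise_samples=100,
                                     quantile=0.5):
    """Hypothetical sketch of Algorithm 2 (not the authors' exact code)."""
    clean = [p.detach().clone() for p in model.parameters()]
    accuracies = []
    for _ in range(noise_samples):
        # Evaluate trigger accuracy at a fresh Gaussian weight perturbation.
        for p, p0 in zip(model.parameters(), clean):
            p.copy_(p0 + sigma * torch.randn_like(p0))
        correct = total = 0
        for x, y in trigger_loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
        accuracies.append(correct / total)
    # Restore the unperturbed weights before returning.
    for p, p0 in zip(model.parameters(), clean):
        p.copy_(p0)
    # The median (0.5 quantile) of the noisy accuracies is the smoothed
    # statistic being certified.
    return torch.tensor(accuracies).quantile(quantile).item()
```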
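The Open Datasets row notes that out-of-distribution MNIST images can serve as the trigger set for a CIFAR-10 network. A minimal torchvision sketch of that construction follows; the resize/channel handling, the trigger-set size of 100, and the use of digit classes as trigger labels are all assumptions, since the row only gives the one example sentence.

```python
import torch
from torchvision import datasets, transforms

# MNIST digits reshaped to CIFAR-10's 3x32x32 format; the digit labels
# are unrelated to CIFAR-10 semantics, so the pairs act as an
# out-of-distribution trigger set. The size of 100 is an assumption.
trigger_transform = transforms.Compose([
    transforms.Resize(32),                        # MNIST is 28x28
    transforms.Grayscale(num_output_channels=3),  # match CIFAR-10 channels
    transforms.ToTensor(),
])
mnist = datasets.MNIST("data", train=True, download=True,
                       transform=trigger_transform)
trigger_set = torch.utils.data.Subset(mnist, range(100))
trigger_loader = torch.utils.data.DataLoader(trigger_set, batch_size=50)
```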
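Finally, the hyperparameters quoted in the Experiment Setup row translate directly into a standard PyTorch configuration. The StepLR scheduler below is an assumed but conventional way to express "divided by 10 every 30 epochs"; the training and attack loops themselves are omitted.

```python
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # CIFAR-10 output size

# Watermark training: SGD with lr=0.05, momentum=0.9, weight_decay=1e-4;
# StepLR divides the learning rate by 10 every 30 of the 100 epochs
# (call scheduler.step() once per epoch).
optimizer = torch.optim.SGD(model.parameters(), lr=0.05,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

# Removal attack (run by the adversary for 50 epochs): Adam with one of
# the three quoted learning rates.
attack_optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # or 1e-3 / 1e-4
```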