Probabilistically Robust Watermarking of Neural Networks

Authors: Mikhail Pautov, Nikita Bogdanov, Stanislav Pyatkin, Oleg Rogov, Ivan Oseledets

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our method on multiple benchmarks and show that our approach outperforms current state-of-the-art watermarking techniques in all considered experimental setups." (Section 5, Experiments)
Researcher Affiliation | Collaboration | Mikhail Pautov (1,2,3), Nikita Bogdanov (2), Stanislav Pyatkin (2), Oleg Rogov (1,2), and Ivan Oseledets (1,2). Affiliations: (1) Artificial Intelligence Research Institute, Moscow, Russia; (2) Skolkovo Institute of Science and Technology, Moscow, Russia; (3) ISP RAS Research Center for Trusted Artificial Intelligence, Moscow, Russia. Emails: {mikhail.pautov, nikita.bogdanov, stanislav.pyatkin}@skoltech.ru, {rogov, oseledets}@airi.net
Pseudocode | Yes | Algorithm 1 (Trigger set candidate). Input: hold-out dataset Dh, source model f. Output: trigger set candidate (x′, y′). (A Python sketch of this algorithm follows the table.)
1: while True do
2:   Sample (x1, y1), (x2, y2) ∼ U(Dh)
3:   if y1 ≠ y2 then
4:     Sample λ ∼ U(0, 1)
5:     x′ = λx1 + (1 − λ)x2
6:     y′ = f(x′)
7:     if y′ ≠ y1 and y′ ≠ y2 then
8:       return (x′, y′)
9:     end if
10:  end if
11: end while
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | "In our experiments, we use CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009] as training datasets for our source model f." (A loading sketch follows the table.)
Dataset Splits | No | The paper mentions using CIFAR-10 and CIFAR-100 as training datasets and a hold-out test set Dh, but does not specify explicit training, validation, and test splits (e.g., percentages or exact sample counts for each split).
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper mentions using an SGD optimizer but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions) needed to replicate the experimental environment.
Experiment Setup | Yes | We used the SGD optimizer with a learning rate of 0.1, weight decay of 0.5 × 10^-3, and momentum of 0.9. The parameter δ was varied in the range [0.5, 40], and τ was chosen from the set {0.1, 0.2, 1.0}. We tested different numbers of proxy models sampled from B_{δ,τ}(f) for verification: the parameter m was chosen from the set {1, 2, 4, 8, 16, 32, 64, 128, 256}. Unless stated otherwise, we use the following hyperparameter values in our experiments: the size of the verified trigger set is n = 100 for consistency with concurrent works, and the confidence level α for the Clopper-Pearson test from Eq. (9) is α = 0.05. We found that better transferability of the verified trigger set is achieved when no constraint on the performance of the proxy models is applied, so the performance threshold is set to τ = 1.0. Based on parameter tuning, we choose m = 64 and δ = 40.0 as the default parameters of the proxy set. Table 4 reports the parameter values used in each experiment. (A configuration sketch follows the table.)
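
As a reading aid for the Pseudocode row above, a minimal Python sketch of Algorithm 1 follows. The (image, label) pair layout of the hold-out set, the PyTorch-style prediction call, and the function name trigger_set_candidate are illustrative assumptions, not the authors' released code.

    import random

    import torch

    def trigger_set_candidate(holdout, source_model):
        # Sketch of Algorithm 1: interpolate two hold-out samples with different
        # labels and accept the mixture if the source model assigns it to a class
        # different from both parent labels.
        # `holdout` is assumed to be a list of (image_tensor, int_label) pairs and
        # `source_model` a classifier returning logits.
        while True:
            (x1, y1), (x2, y2) = random.sample(holdout, 2)   # (x1, y1), (x2, y2) ~ U(Dh)
            if y1 != y2:
                lam = random.random()                        # λ ~ U(0, 1)
                x_mix = lam * x1 + (1.0 - lam) * x2          # x′ = λ·x1 + (1 − λ)·x2
                with torch.no_grad():
                    y_mix = source_model(x_mix.unsqueeze(0)).argmax(dim=1).item()  # y′ = f(x′)
                if y_mix != y1 and y_mix != y2:
                    return x_mix, y_mix                      # accept (x′, y′)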
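
The Open Datasets row names CIFAR-10 and CIFAR-100 as training data. A standard torchvision loading sketch is shown below; the transform, the storage path, and the use of the official test split as the hold-out set Dh are assumptions for illustration, not the paper's reported preprocessing.

    import torchvision
    import torchvision.transforms as T

    transform = T.ToTensor()  # minimal preprocessing; the paper's exact transforms are not specified
    cifar10_train = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
    cifar100_train = torchvision.datasets.CIFAR100(root="./data", train=True, download=True, transform=transform)
    # One possible choice for the hold-out set Dh used to build trigger candidates:
    cifar10_holdout = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)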
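
For the Experiment Setup row, the configuration sketch below wires up the quoted SGD hyperparameters and a Clopper-Pearson bound at α = 0.05 for n = 100 trigger samples. The placeholder model, the scipy-based implementation, and the choice of a one-sided lower bound are assumptions; Eq. (9) in the paper may define the interval differently.

    import torch
    from scipy.stats import beta

    model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for the source model f
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=0.5e-3, momentum=0.9)

    def clopper_pearson_lower(k, n, alpha=0.05):
        # One-sided Clopper-Pearson lower confidence bound on a success probability
        # after observing k successes in n trials (k = verified trigger samples).
        if k == 0:
            return 0.0
        return float(beta.ppf(alpha, k, n - k + 1))

    # Example: 97 of the n = 100 trigger samples verified on a proxy model.
    print(clopper_pearson_lower(k=97, n=100, alpha=0.05))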