Margin-based Neural Network Watermarking

Authors: Byungjoo Kim, Suyoung Lee, Seanie Lee, Sooel Son, Sung Ju Hwang

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our method on multiple benchmarks and show that our watermarking method successfully defends against model extraction attacks, outperforming relevant baselines. We validate our margin-based watermarking method against functionality stealing methods, extraction and distillation, showing that it significantly outperforms previous methods in terms of both clean and watermarking accuracy. We empirically show that our method is robust to how we construct trigger sets, which makes it challenging for adversaries to identify the trigger set. (Sections 4: Experiments; 4.1: Experimental Setup; 4.2: Results on Functionality Stealing)
Researcher Affiliation | Academia | Byungjoo Kim¹, Suyoung Lee², Seanie Lee¹, Sooel Son², Sung Ju Hwang¹ (¹Kim Jaechul Graduate School of AI, KAIST; ²Graduate School of Information Security, KAIST)
Pseudocode | Yes | Algorithm 1: Margin-based watermarking
Open Source Code | Yes | The source codes are available at: https://github.com/matbambbang/margin-based-watermarking
Open Datasets | Yes | We use the CIFAR-10 or CIFAR-100 dataset (Krizhevsky et al., 2009) as D and split the training set of the CIFAR-10 dataset into a train set Db, and a validation set.
Dataset Splits | Yes | We use the CIFAR-10 or CIFAR-100 dataset (Krizhevsky et al., 2009) as D and split the training set of the CIFAR-10 dataset into a train set Db, and a validation set. (A hypothetical split is sketched below the table.)
Hardware Specification | No | The paper mentions using "ResNet-34 for all the watermarked model" but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions "ResNet-34" for models and "SGD optimizer" but does not specify software dependencies with version numbers (e.g., PyTorch version, Python version, CUDA version).
Experiment Setup | Yes | We use ResNet-34 for all the watermarked models with the SGD optimizer, a learning rate of 0.1, weight decay of 0.0001, and learning-rate decay of 0.1 at epochs 100 and 150. The training was done for 200 epochs. For distillation and extraction, we used the same optimizer as for training the source model. When varying the number of iterations K in Algorithm 1 and Equation (3), we also vary the maximum perturbation size ϵ; for instance, when K = 3, ϵ = 3/255. For our margin-based watermarking method, we set the trigger set size and the batch size of the trigger set to 100 and 25, respectively. We perform the inner maximization in Equation (4) for 5 steps with a step size η1 = 1/255 and ϵ = 5/255. For the attack regularized by the ground-truth label in Equation (5), we set α to 0.3. (A sketch of the inner-maximization loop follows the table.)
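The Open Datasets and Dataset Splits rows quote a CIFAR-10 train/validation split without stating the split sizes. A minimal PyTorch sketch of such a split; the 45,000/5,000 ratio and the fixed seed are assumptions, as neither appears in the quoted text:

```python
# Hypothetical sketch of the CIFAR-10 train/validation split quoted in the
# Open Datasets and Dataset Splits rows. The 45,000/5,000 sizes and the seed
# are assumptions; the quoted text does not specify them.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

val_size = 5_000                              # assumed, not from the paper
train_size = len(full_train) - val_size       # 45,000
generator = torch.Generator().manual_seed(0)  # fixed seed for a stable split
train_set, val_set = random_split(full_train, [train_size, val_size],
                                  generator=generator)
print(len(train_set), len(val_set))           # 45000 5000
```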
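The Experiment Setup row describes an inner maximization run for 5 steps with step size η1 = 1/255 under a perturbation budget ϵ = 5/255. A minimal sketch of such a projected-gradient-ascent loop, assuming an ℓ∞ ball and using cross-entropy as a surrogate for the paper's margin objective in Equation (4), which is not reproduced in this summary:

```python
# Hypothetical sketch of the inner maximization described in the Experiment
# Setup row: 5 steps, step size eta1 = 1/255, perturbation budget eps = 5/255.
# Cross-entropy stands in for the margin objective of Equation (4).
import torch
import torch.nn.functional as F

def inner_maximization(model, x, y, steps=5, eta1=1/255, eps=5/255):
    """Projected gradient ascent on the loss within an l_inf ball of radius eps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += eta1 * grad.sign()   # ascent step on the surrogate loss
            delta.clamp_(-eps, eps)       # project back into the l_inf ball
    # Clamp to the valid pixel range, assuming inputs scaled to [0, 1].
    return (x + delta).detach().clamp(0.0, 1.0)
```

Here model, x, and y are placeholders for any classifier, input batch, and target labels; in the paper this loop perturbs trigger-set inputs. The in-place updates under torch.no_grad() mirror how optimizers update leaf tensors, so delta remains a valid leaf for the next gradient call.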