Deep Neural Network Fingerprinting by Conferrable Adversarial Examples

Authors: Nils Lukas, Yuxuan Zhang, Florian Kerschbaum

Venue: ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an extensive study on the irremovability of our fingerprint against fine-tuning, weight pruning, retraining, retraining with different architectures, three model extraction attacks from related work, transfer learning, adversarial training, and two new adaptive attacks. Our fingerprint is the first method that reaches a ROC AUC of 1.0 in verifying surrogates, compared to a ROC AUC of 0.63 by previous fingerprints.
Researcher Affiliation | Academia | Nils Lukas, Yuxuan Zhang, Florian Kerschbaum; University of Waterloo; {nlukas, y2536zhang, florian.kerschbaum}@uwaterloo.ca
Pseudocode | No | The paper describes algorithms and methods in prose and mathematical equations but does not include any explicitly labeled pseudocode blocks or algorithm figures.
Open Source Code | No | The paper mentions re-using existing implementations such as the "Adversarial Robustness Toolbox v1.3.1" and "Tensorflow Privacy", and states "All remaining attacks and IPGuard (Cao et al., 2019) are re-implemented from scratch." However, it does not provide a link or an explicit statement that the authors' own implementation of the methodology described in the paper is open-source or publicly available.
Open Datasets | Yes | We train popular CNN architectures on CIFAR-10 without any modification to the standard training process. Attacker Datasets. We distinguish between attackers with access to different datasets, which are CIFAR-10 (Krizhevsky et al.), CINIC (Darlow et al., 2018) and ImageNet32 (Chrabaszcz et al., 2017). (A hedged CIFAR-10 loading sketch follows the table.)
Dataset Splits | No | The paper mentions using training data and test samples (e.g., "CIFAR-10 test samples") and refers to "validation of a model" in a general sense, but it does not specify a distinct validation split (e.g., percentages or sample counts) used for hyperparameter tuning or early stopping in its experimental setup.
Hardware Specification | Yes | We train all models on a server running Ubuntu 18.04 in 64-bit mode using four Tesla P100 GPUs and 128 cores of an IBM POWER8 CPU (2.4 GHz) with up to 1 TB of accessible RAM.
Software Dependencies | Yes | The machine learning is implemented in Keras using the TensorFlow v2.2 backend, and datasets are loaded using TensorFlow Datasets. We re-use implementations of existing adversarial attacks from the Adversarial Robustness Toolbox v1.3.1 (Nicolae et al., 2018). DP-SGD (Abadi et al., 2016) is implemented through TensorFlow Privacy (Andrew et al., 2019). (A hedged DP-SGD sketch follows the table.)
Experiment Setup | Yes | For the adversarial examples generated with attacks like FGM, PGD and CW-L∞, we use the following parametrization. We limit the maximum number of iterations in PGD to 10 and use a step size of 0.01. For FGM, we use a step size of ϵ, and for CW-L∞ we limit the number of iterations to 50, use a confidence parameter k = 0.5, a learning rate of 0.01 and a step size of 0.01. In all our experiments, we use weights α = β = γ = 1 and refer to Appendix A.6 for an empirical sensitivity analysis of the hyperparameters. We use Dropout (Srivastava et al., 2014) with a drop ratio of d = 0.3. (A hedged attack-parametrization sketch follows the table.)
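The quoted dataset setup (CIFAR-10, with CINIC and ImageNet32 as attacker datasets) relies only on publicly available data. Below is a minimal sketch of how the CIFAR-10 portion could be loaded with the TensorFlow Datasets dependency named above; the "cifar10" loader name and the [0, 1] pixel scaling are assumptions, not details stated in the excerpt.

```python
# Minimal sketch: load CIFAR-10 via TensorFlow Datasets.
# Split names and the [0, 1] normalization are assumptions, not quoted details.
import tensorflow as tf
import tensorflow_datasets as tfds

def load_cifar10(batch_size=128):
    """Return normalized (train, test) tf.data pipelines for CIFAR-10."""
    train_ds, test_ds = tfds.load("cifar10", split=["train", "test"], as_supervised=True)

    def preprocess(image, label):
        image = tf.cast(image, tf.float32) / 255.0  # scale pixels to [0, 1]
        return image, tf.one_hot(label, depth=10)

    train_ds = train_ds.map(preprocess).shuffle(10_000).batch(batch_size)
    test_ds = test_ds.map(preprocess).batch(batch_size)
    return train_ds, test_ds
```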
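The dependencies row also names TensorFlow Privacy for DP-SGD. The sketch below shows one way a Keras model could be compiled with TensorFlow Privacy's DPKerasSGDOptimizer; the clipping norm, noise multiplier, and microbatch count are placeholders, since the excerpt above does not report the paper's DP-SGD hyperparameters.

```python
# Hedged sketch: training with DP-SGD through TensorFlow Privacy's Keras optimizer.
# l2_norm_clip, noise_multiplier and num_microbatches are placeholder values,
# not hyperparameters reported in the excerpt above.
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import DPKerasSGDOptimizer

def compile_with_dpsgd(model, batch_size=128):
    optimizer = DPKerasSGDOptimizer(
        l2_norm_clip=1.0,        # placeholder clipping norm
        noise_multiplier=1.1,    # placeholder noise scale
        num_microbatches=batch_size,
        learning_rate=0.01,
    )
    # Per-example losses are required so gradients can be clipped per microbatch.
    loss = tf.keras.losses.CategoricalCrossentropy(
        from_logits=False, reduction=tf.keras.losses.Reduction.NONE
    )
    model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
    return model
```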
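The attack parameters in the Experiment Setup row map onto the Adversarial Robustness Toolbox API that the dependencies row says the paper reuses. The following is a hedged sketch assuming ART 1.x class names (KerasClassifier, FastGradientMethod, ProjectedGradientDescent, CarliniLInfMethod); in particular, how the quoted CW-L∞ "step size of 0.01" maps onto ART's arguments is a guess, and exact signatures vary slightly across ART versions.

```python
# Hedged sketch: instantiating the quoted FGM / PGD / CW-Linf parameters with the
# Adversarial Robustness Toolbox (ART). Argument names follow the ART 1.x API.
import numpy as np
from art.estimators.classification import KerasClassifier
from art.attacks.evasion import (
    FastGradientMethod,
    ProjectedGradientDescent,
    CarliniLInfMethod,
)

def build_attacks(keras_model, eps):
    """Wrap a trained Keras model in ART and return FGM, PGD, and CW-Linf attacks
    configured with the parameters quoted in the Experiment Setup row."""
    clf = KerasClassifier(model=keras_model, clip_values=(0.0, 1.0))

    # FGM: single step with step size equal to the perturbation budget epsilon.
    fgm = FastGradientMethod(clf, norm=np.inf, eps=eps, eps_step=eps)

    # PGD: at most 10 iterations with a step size of 0.01.
    pgd = ProjectedGradientDescent(clf, norm=np.inf, eps=eps, eps_step=0.01, max_iter=10)

    # CW-Linf: 50 iterations, confidence k = 0.5, learning rate 0.01.
    # ART exposes no separate step-size argument here, so only the budget is passed as eps.
    cw = CarliniLInfMethod(clf, confidence=0.5, learning_rate=0.01, max_iter=50, eps=eps)

    return fgm, pgd, cw

# Usage (hypothetical): x_adv = pgd.generate(x=x_batch)
```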