Dual Student Networks for Data-Free Model Stealing

Authors: James Beetham, Navid Kardan, Ajmal Saeed Mian, Mubarak Shah

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the Dual Students method on various datasets for standard classification accuracy and transfer-based attack effectiveness. The datasets used are MNIST, Fashion MNIST, GTSRB, SVHN, CIFAR10, and CIFAR100 (Xiao et al., 2017; Stallkamp et al., 2011; Netzer et al., 2011; Krizhevsky et al., 2009). The target model architecture is ResNet-34, while the students are ResNet-18 and are trained for 2 million queries for MNIST, Fashion MNIST, GTSRB, and SVHN and 20 million queries for CIFAR10 and CIFAR100 (He et al., 2016).
Researcher Affiliation | Academia | James Beetham¹, Navid Kardan¹, Ajmal Mian², Mubarak Shah¹; ¹Center for Research in Computer Vision, University of Central Florida; ²Department of Computer Science, University of Western Australia
Pseudocode | Yes | We provide an algorithm for our method in the supplemental materials.
Open Source Code | No | In the supplemental material we provide the algorithm as well as additional hyperparameters necessary for replicating the results shown in the tables.
Open Datasets | Yes | The datasets used are MNIST, Fashion MNIST, GTSRB, SVHN, CIFAR10, and CIFAR100 (Xiao et al., 2017; Stallkamp et al., 2011; Netzer et al., 2011; Krizhevsky et al., 2009).
Dataset Splits | No | No explicit train/validation/test splits are described; the paper specifies only training budgets: The target model architecture is ResNet-34, while the students are ResNet-18 and are trained for 2 million queries for MNIST, Fashion MNIST, GTSRB, and SVHN and 20 million queries for CIFAR10 and CIFAR100 (He et al., 2016).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper.
Software Dependencies | No | No specific software dependencies with version numbers were provided in the paper.
Experiment Setup | Yes | The target model architecture is ResNet-34, while the students are ResNet-18 and are trained for 2 million queries for MNIST, Fashion MNIST, GTSRB, and SVHN and 20 million queries for CIFAR10 and CIFAR100 (He et al., 2016). The generator architecture is the same 3-layer convolutional model used in DFME. The student loss used for soft-labels is LS = ℓ1, whereas the student loss used for hard-labels is cross entropy, i.e. LS = ce, as this is standard for classification tasks. The generator optimization maximizes the difference between the outputs of the two students, and LG = ℓ1 is used in the generator loss when training on both soft-labels and hard-labels. More experiment details are provided in the supplemental material. (An illustrative sketch of this training loop appears after the table.)
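To make the quoted setup concrete, below is a minimal PyTorch sketch of the soft-label training loop it describes: two ResNet-18 students are fit to a black-box ResNet-34 target with an ℓ1 loss, while the generator is updated to maximize the ℓ1 disagreement between the two students. The generator architecture, optimizers, learning rates, batch size, and step count are illustrative assumptions, not the authors' settings; the paper's supplemental material contains the authoritative algorithm and hyperparameters.

```python
# Sketch of the dual-student loop described above. Hyperparameters and the
# generator below are assumptions for illustration, not the paper's values.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18, resnet34

device = "cuda" if torch.cuda.is_available() else "cpu"

teacher = resnet34(num_classes=10).to(device).eval()  # stands in for the trained black-box target
student_a = resnet18(num_classes=10).to(device)       # the two students
student_b = resnet18(num_classes=10).to(device)

# Hypothetical stand-in for the 3-layer convolutional generator used in DFME.
generator = nn.Sequential(
    nn.Linear(100, 128 * 8 * 8),
    nn.Unflatten(1, (128, 8, 8)),
    nn.BatchNorm2d(128), nn.ReLU(),
    nn.Upsample(scale_factor=2),
    nn.Conv2d(128, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
    nn.Upsample(scale_factor=2),
    nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
).to(device)

opt_s = torch.optim.SGD(
    list(student_a.parameters()) + list(student_b.parameters()), lr=0.1)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)

for step in range(1000):  # stands in for the 2M/20M query budgets
    z = torch.randn(64, 100, device=device)

    # Generator step: maximize the l1 disagreement between the two students
    # (LG = l1), steering synthesized queries to where the students differ.
    # No gradient through the teacher is needed for this step.
    x = generator(z)
    loss_g = -F.l1_loss(student_a(x), student_b(x))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # Student step: both students match the target's soft labels with an
    # l1 loss (LS = l1); the teacher is queried as a black box.
    x = generator(z).detach()
    with torch.no_grad():
        t = teacher(x)
    loss_s = F.l1_loss(student_a(x), t) + F.l1_loss(student_b(x), t)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```

For the hard-label setting quoted above, the student loss would instead be cross entropy against the target's predicted class (e.g. `F.cross_entropy(student_a(x), t.argmax(1))`), while the generator loss remains the ℓ1 disagreement between the students.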