Dataset Inference for Self-Supervised Models

Authors: Adam Dziedzic, Haonan Duan, Muhammad Ahmad Kaleem, Nikita Dhawan, Jonas Guan, Yannis Cattan, Franziska Boenisch, Nicolas Papernot

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive empirical results in the vision domain demonstrate that dataset inference is a promising direction for defending self-supervised models against model stealing. |
| Researcher Affiliation | Academia | University of Toronto and Vector Institute |
| Pseudocode | Yes | Algorithm 1 summarizes the stealing approach used by an adversary. |
| Open Source Code | No | The paper references 'an open-source PyTorch implementation of SimCLR' (https://github.com/kuangliu/pytorch-cifar), but this is a third-party tool used by the authors, not their own source code for the proposed defense. |
| Open Datasets | Yes | We evaluate our defense against encoder extraction attacks using five different vision datasets (CIFAR10, CIFAR100 [28], SVHN [34], STL10 [8], and ImageNet [11]). |
| Dataset Splits | Yes | For SVHN, we merge the original training and test splits, use a randomly selected 80% as the training set, and keep the remaining 20% as the test set. For SVHN and CIFAR10, we use 50% of the training set to train GMMs and the remainder for evaluation. |
| Hardware Specification | No | The paper does not specify the GPU or CPU models used for the experiments, or any other hardware details. |
| Software Dependencies | No | The paper mentions using a 'PyTorch implementation' but does not specify its version or any other software dependencies with version details. |
| Experiment Setup | Yes | We train GMMs with 10 components for SVHN and CIFAR10, and 50 components for ImageNet. In general, we observe that the larger the number of GMM components, the better the defense. For ImageNet, we restrict the covariance matrix to be diagonal for efficiency. For CIFAR10 and SVHN, we use the full covariance matrix. |
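
The split procedure quoted in the 'Dataset Splits' row can be made concrete with a short sketch. This is a minimal illustration under assumptions, not the authors' code: the use of torchvision and scikit-learn's `train_test_split`, the variable names, and the fixed random seed are all choices made here for clarity.

```python
# Minimal sketch of the SVHN split described in the paper: merge the
# original train/test splits, re-split 80/20 at random, then reserve 50%
# of the training set for GMM fitting. Library choices and names are
# assumptions, not the authors' implementation.
import numpy as np
from sklearn.model_selection import train_test_split
from torchvision import datasets

# Merge the original SVHN training and test splits.
svhn_train = datasets.SVHN(root="data", split="train", download=True)
svhn_test = datasets.SVHN(root="data", split="test", download=True)
images = np.concatenate([svhn_train.data, svhn_test.data])
labels = np.concatenate([svhn_train.labels, svhn_test.labels])

# Randomly re-split: 80% training, 20% test.
x_train, x_test, y_train, y_test = train_test_split(
    images, labels, train_size=0.8, random_state=0  # seed is assumed
)

# Use 50% of the training set to fit GMMs; hold out the rest for evaluation.
x_gmm, x_eval, y_gmm, y_eval = train_test_split(
    x_train, y_train, train_size=0.5, random_state=0
)
```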
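
The GMM configuration quoted in the 'Experiment Setup' row can likewise be sketched. The snippet below assumes scikit-learn's `GaussianMixture` as a stand-in, since the paper does not name its GMM implementation; the `build_gmm` helper and the representation shapes are hypothetical.

```python
# Sketch of the reported GMM settings: 10 components with full covariance
# for CIFAR10/SVHN, 50 components with diagonal covariance for ImageNet
# (diagonal is restricted for efficiency, per the paper).
from sklearn.mixture import GaussianMixture

def build_gmm(dataset_name: str) -> GaussianMixture:
    # build_gmm is a hypothetical helper, not from the paper.
    if dataset_name == "imagenet":
        return GaussianMixture(n_components=50, covariance_type="diag")
    # CIFAR10 and SVHN use the full covariance matrix.
    return GaussianMixture(n_components=10, covariance_type="full")

# Usage: fit on encoder representations of the GMM split, e.g.
#   gmm = build_gmm("cifar10")
#   gmm.fit(train_representations)  # shape: (n_samples, embedding_dim)
```

Consistent with the paper's observation that more components help the defense, the component count is the main knob here; the covariance structure trades fidelity for efficiency on the larger ImageNet embedding space.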