Auditing Differentially Private Machine Learning: How Private is Private SGD?

Authors: Matthew Jagielski, Jonathan Ullman, Alina Oprea

NeurIPS 2020

Reproducibility audit (variable, result, and LLM response):
Research Type: Experimental
Evidence: "For every dataset and model, we find that Clip BKD significantly outperforms MI, by a factor of between 2.5x and 1500x. As a representative example, for ε_th = 4 on Purchase-100 with 2-layer neural networks, Clip BKD gives an ε_LB of 0.46, while MI gives an ε_LB of 0.04, an improvement of 12.1x. We also find Clip BKD always improves over standard backdoors: on FMNIST by an average factor of 3.84x, and standard backdoors never reach positive ε_LB on CIFAR, due to the large number of points required to poison the pretrained model."
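The ε_LB values quoted above come from converting an attack's measured false-positive and false-negative rates into a certified lower bound on ε. A minimal sketch of that conversion follows; the helper name `eps_lower_bound` is our own, and the paper additionally tightens the measured rates with Clopper-Pearson confidence intervals, which this sketch omits. Under the hypothesis-testing view of (ε, 0)-DP, any distinguishing attack must satisfy FPR + e^ε · FNR ≥ 1 (and symmetrically with the roles swapped), so observed error rates certify ε ≥ log((1 − FPR) / FNR):

```python
import math

def eps_lower_bound(fpr: float, fnr: float) -> float:
    """Epsilon lower bound implied by an attack's error rates.

    For (eps, 0)-DP, FPR + e^eps * FNR >= 1 must hold for any attack,
    so eps >= log((1 - FPR) / FNR); taking the symmetric bound too
    gives the best certified value from a single (FPR, FNR) pair.
    """
    return max(
        math.log((1.0 - fpr) / fnr),
        math.log((1.0 - fnr) / fpr),
    )

# A near-random attack certifies almost nothing:
weak = eps_lower_bound(fpr=0.45, fnr=0.45)    # ≈ 0.20
# A strong attack certifies a much larger lower bound:
strong = eps_lower_bound(fpr=0.05, fnr=0.05)  # ≈ 2.94
```

This illustrates why stronger attacks (like Clip BKD) yield larger ε_LB values than weaker ones (like MI) on the same trained model.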
Researcher Affiliation: Academia
Evidence: "Matthew Jagielski, Northeastern University, jagielski@ccs.neu.edu; Jonathan Ullman, Northeastern University, jullman@northeastern.edu; Alina Oprea, Northeastern University, a.oprea@northeastern.edu"
Pseudocode: Yes
Evidence (Algorithm 1: DP-SGD):
  Input: clipping norm C, noise magnitude σ, iteration count T, batch size b, dataset D, initial model parameters θ_0, learning rate η
  For i ∈ [T]:
      G = 0
      For (x, y) in a batch of b random elements of D:
          g = ∇_θ ℓ(θ_{i−1}; (x, y))
          G = G + (1/b) · g · min(1, C / ||g||_2)
      θ_i = θ_{i−1} − η · (G + N(0, (Cσ)²I))
  Return θ_T
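One possible reading of the recovered Algorithm 1 in plain NumPy is sketched below. This is an illustration, not the authors' implementation (the names `dp_sgd` and `grad_fn` are our own, and batch sampling details are an assumption): each per-example gradient is clipped to ℓ2-norm at most C, the batch is averaged, and Gaussian noise of scale Cσ is added before the update.

```python
import numpy as np

def dp_sgd(grad_fn, data, theta0, C=1.0, sigma=1.0, T=100, b=32,
           eta=0.1, rng=None):
    """Sketch of DP-SGD (Algorithm 1): clip, average, add noise, step.

    grad_fn(theta, x, y) must return the per-example loss gradient.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    theta = np.asarray(theta0, dtype=float)
    n = len(data)
    for _ in range(T):
        G = np.zeros_like(theta)
        # Sample a batch of b random elements of D.
        for j in rng.choice(n, size=b, replace=False):
            x, y = data[j]
            g = np.asarray(grad_fn(theta, x, y), dtype=float)
            norm = np.linalg.norm(g)
            # Clip the per-example gradient to l2-norm at most C.
            scale = min(1.0, C / norm) if norm > 0 else 1.0
            G += g * scale / b
        # Gradient step with Gaussian noise of scale C * sigma.
        theta = theta - eta * (G + rng.normal(0.0, C * sigma,
                                              size=theta.shape))
    return theta
```

With sigma = 0 and a large clipping norm this reduces to ordinary mini-batch SGD, which is a useful sanity check when experimenting with the noise and clipping parameters.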
Open Source Code: No
Evidence: No explicit statement or link providing concrete access to source code for the methodology described in this paper was found. The paper mentions using 'TensorFlow Privacy [Goo]' but does not provide the authors' own code.
Open Datasets: Yes
Evidence: "We evaluate both membership inference (MI, as used by [YGFJ18] and [JE19] and described in Appendix D) and our algorithms on three datasets: Fashion-MNIST (FMNIST), CIFAR10, and Purchase-100 (P100). FMNIST [XRV17] is a dataset of 70000 28x28 pixel images of clothing from one of 10 classes, split into a train set of 60000 images and a test set of 10000 images. It is a standard benchmark dataset for differentially private machine learning. CIFAR10 [Kri09] is a harder dataset than FMNIST, consisting of 60000 32x32x3 images of vehicles and animals, split into a train set of 50000 and a test set of 10000. P100 [SSSS17] is a modification of a Kaggle dataset [Pur], with 200000 records of 100 features, and 100 classes."
Dataset Splits: No
Evidence: The paper describes train and test splits for datasets (e.g., "FMNIST... split into a train set of 60000 images and a test set of 10000 images"), but does not explicitly state a validation set split or provide specific details for a three-way train/validation/test split.
Hardware Specification: No
Evidence: No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running experiments were provided in the paper.
Software Dependencies: No
Evidence: The paper mentions 'TensorFlow Privacy [Goo]' but does not provide a specific version number for this or any other software dependency.
Experiment Setup: Yes
Evidence: "Table 1: Training details for experiments in Section 4. P100 regularization is 10^-5 for logistic regression and 10^-4 for neural networks, following [JE19]."