Scalable Private Learning with PATE

Authors: Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Úlfar Erlingsson

ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our evaluation shows our mechanisms improve on the original PATE on all measures, and scale to larger tasks with both high utility and very strong privacy (ε < 1.0). Section 5 details our experimental evaluation;
Researcher Affiliation Collaboration Nicolas Papernot Pennsylvania State University ngp5056@cse.psu.edu Shuang Song University of California San Diego shs037@eng.ucsd.edu Ilya Mironov, Ananth Raghunathan, Kunal Talwar & Úlfar Erlingsson Google Brain {mironov,pseudorandom,kunal,ulfar}@google.com
Pseudocode Yes Algorithm 1 Confident-GNMax Aggregator: given a query, consensus among teachers is first estimated in a privacy-preserving way to then only reveal confident teacher predictions.
Input: input x, threshold T, noise parameters σ1 and σ2
1: if max_j{n_j(x)} + N(0, σ1²) ≥ T then ▷ Privately check for consensus
2: return argmax_j{n_j(x) + N(0, σ2²)} ▷ Run the usual max-of-Gaussian
3: else
4: return ⊥
5: end if
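To make the quoted aggregator concrete, here is a minimal NumPy sketch of the two noisy steps of Confident-GNMax. The function name, the abstain value None, and the toy vote counts and noise parameters are illustrative assumptions, not the authors' implementation (their privacy-analysis code is linked in the next entry).

```python
# A minimal sketch of Confident-GNMax aggregation, assuming only NumPy.
# Names (confident_gnmax, vote_counts) and the toy parameters are illustrative.
import numpy as np


def confident_gnmax(vote_counts, threshold, sigma1, sigma2, rng=None):
    """Answer a query with a noisy plurality label only if teacher consensus is high.

    vote_counts: per-class teacher vote counts n_j(x) for one query x.
    threshold:   consensus threshold T.
    sigma1:      noise scale for the private consensus check.
    sigma2:      noise scale for the usual max-of-Gaussian step.
    Returns the noisy argmax label, or None to abstain (the query is not answered).
    """
    rng = rng or np.random.default_rng()
    counts = np.asarray(vote_counts, dtype=float)

    # Step 1: privately check for consensus on the top vote count.
    if counts.max() + rng.normal(0.0, sigma1) >= threshold:
        # Step 2: run the usual max-of-Gaussian over all per-class counts.
        return int(np.argmax(counts + rng.normal(0.0, sigma2, size=counts.shape)))
    return None  # abstain, saving privacy budget for easier queries


# Toy example: 250 teachers over 10 classes with strong consensus on class 3.
votes = [2, 1, 0, 230, 3, 5, 4, 2, 1, 2]
print(confident_gnmax(votes, threshold=180, sigma1=30, sigma2=10))
```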
Open Source Code Yes The source code for the privacy analysis in this paper as well as supporting data required to run this analysis is available on Github: https://github.com/tensorflow/models/tree/master/research/differential_privacy
Open Datasets No Glyph data is not public but similar data is available publicly as part of the notMNIST dataset.
Dataset Splits No The test set is split in two halves: the first is used as unlabeled inputs to simulate the student's public data and the second is used as a hold out to evaluate test performance. ... We split holdout data in two subsets of 100K and 400K samples: the first acts as public data to train the student and the second as its testing data.
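For the Glyph setup quoted above, the split amounts to partitioning the 500K held-out samples into 100K student "public" inputs and 400K test samples. A hypothetical sketch, assuming the held-out array is already shuffled:

```python
# A hypothetical sketch of the quoted 100K / 400K Glyph holdout split.
# Assumes the held-out examples are already shuffled; names are illustrative.
import numpy as np


def split_holdout(holdout, n_public=100_000):
    """Split held-out examples into student 'public' data and student test data."""
    public = holdout[:n_public]   # unlabeled inputs on which the student may query the teachers
    test = holdout[n_public:]     # never queried; used only to report student accuracy
    return public, test


public_data, test_data = split_holdout(np.arange(500_000))
print(len(public_data), len(test_data))  # 100000 400000
```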
Hardware Specification No The paper does not provide specific hardware details such as GPU/CPU models or other system specifications used for experiments.
Software Dependencies No The MNIST and SVHN students are convolutional networks trained using semi-supervised learning with GANs à la Salimans et al. (2016). The student for the Adult dataset is a fully supervised random forest. ... The student architecture is a convolutional network learnt in a semi-supervised fashion with virtual adversarial training (VAT) from Miyato et al. (2017).
Experiment Setup Yes Each teacher is a ResNet (He et al., 2016) made of 32 leaky ReLU layers. We train on batches of 100 inputs for 40K steps using SGD with momentum. The learning rate, initially set to 0.1, is decayed after 10K steps to 0.01 and again after 20K steps to 0.001. These parameters were found with a grid search. ... We train with Adam for 400 epochs and a learning rate of 6 × 10⁻⁵.
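The teacher training schedule in this quote is a piecewise-constant decay, which the sketch below spells out. Only the step boundaries and rates come from the quote; the momentum value and the loop helpers are assumptions added for illustration.

```python
# Piecewise-constant learning-rate schedule for the Glyph teachers, per the quote:
# 0.1 for the first 10K steps, 0.01 until 20K steps, then 0.001 up to 40K steps.
def teacher_learning_rate(step):
    if step < 10_000:
        return 0.1
    if step < 20_000:
        return 0.01
    return 0.001


# Hypothetical training-loop outline (batch size 100, 40K steps, SGD with momentum).
# `next_batch` and `sgd_momentum_update` are placeholders; momentum=0.9 is an assumption.
# for step in range(40_000):
#     lr = teacher_learning_rate(step)
#     images, labels = next_batch(size=100)
#     sgd_momentum_update(model, images, labels, lr=lr, momentum=0.9)
```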