Unleashing the Power of Randomization in Auditing Differentially Private ML
Authors: Krishna Pillutla, Galen Andrew, Peter Kairouz, H. Brendan McMahan, Alina Oprea, Sewoong Oh
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We audit an unknown Gaussian mechanism with black-box access and demonstrate (up to) 16× improvement in the sample complexity. We also show how to seamlessly lift recently proposed canary designs in our recipe to improve the sample complexity on real data. (Section 1) and We compare the proposed LiDP auditing recipe relative to the standard one for DP training of machine learning models. Our code is available online. (Section 6) A hedged sketch of the standard (non-lifted) auditing recipe appears after this table. |
| Researcher Affiliation | Collaboration | Krishna Pillutla (1), Galen Andrew (1), Peter Kairouz (1), H. Brendan McMahan (1), Alina Oprea (1, 2), Sewoong Oh (1, 3); affiliations: 1 Google Research, 2 Northeastern University, 3 University of Washington |
| Pseudocode | Yes | Algorithm 1 Auditing Lifted DP |
| Open Source Code | Yes | Our code is available online. (Section 6) with footnote 1 giving the URL: https://github.com/google-research/federated/tree/master/lidp_auditing |
| Open Datasets | Yes | We test with two classification tasks: FMNIST [65] is a 10-class grayscale image classification dataset, while Purchase-100 is a sparse tabular dataset with 600 binary features and 100 classes [19, 53]. (Section 6) and FMNIST: FMNIST or Fashion MNIST [65]... The dataset is available under the MIT license. (Section F.1) and Purchase-100:... The dataset is available publicly on Kaggle but the owners have not created a license as far as we could tell. (Section F.1) |
| Dataset Splits | No | The paper mentions '60K train images and 10K test images' for FMNIST and '20K training points and 5K test points' for Purchase-100. It also states 'maximize the validation accuracy' and refers to 'Dval' as a held-out dataset, but it does not provide explicit sizes or proportions for the validation split, which are necessary for reproducibility (see the split sketch after this table). |
| Hardware Specification | Yes | Hardware. We run each job on an internal compute cluster using only CPUs (i.e., no hardware accelerators such as GPUs were used). Each job was run with 8 CPU cores and 16G memory. (Section F.3) |
| Software Dependencies | No | The paper mentions general software components like 'DP-SGD [1]', 'cross-entropy loss', and 'stochastic gradient descent'. However, it does not provide specific version numbers for any programming languages, libraries, or frameworks (e.g., Python version, TensorFlow/PyTorch version, scikit-learn version). |
| Experiment Setup | Yes | We train each model for 30 epochs with a batch size of 100 and a fixed learning rate of 0.02 for the linear model and 0.01 for the MLP. (F.1 FMNIST) and The model is an MLP with 2 hidden layers of 256 units each. It is trained for 100 epochs with a batch size of 100 and a fixed learning rate of 0.05. (F.1 Purchase-100) A configuration sketch using these values follows this table. |
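
The Research Type row quotes the paper's comparison of the proposed LiDP recipe against the standard auditing recipe. Below is a minimal sketch of that *standard* baseline only (not the authors' LiDP method): converting the empirical true/false positive rates of a canary membership-inference attack into a lower bound on epsilon via the (ε, δ)-DP relation TPR ≤ e^ε · FPR + δ and Clopper-Pearson confidence intervals. All function names, the δ value, and the trial counts are illustrative assumptions, not values from the paper.

```python
# Sketch of the standard DP auditing lower bound (baseline, not LiDP).
import numpy as np
from scipy.stats import beta


def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    """Two-sided Clopper-Pearson confidence interval for a binomial proportion."""
    lo = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lo, hi


def audit_epsilon_lower_bound(tp: int, n_pos: int, fp: int, n_neg: int,
                              delta: float = 1e-5, alpha: float = 0.05) -> float:
    """Lower bound on epsilon implied by TPR <= exp(eps) * FPR + delta.

    tp / n_pos: attack successes on trials where the canary WAS inserted.
    fp / n_neg: attack "successes" on trials where the canary was NOT inserted.
    """
    tpr_lo, _ = clopper_pearson(tp, n_pos, alpha)   # pessimistic (low) TPR
    _, fpr_hi = clopper_pearson(fp, n_neg, alpha)   # pessimistic (high) FPR
    if tpr_lo <= delta or fpr_hi <= 0.0:
        return 0.0
    return max(0.0, float(np.log((tpr_lo - delta) / fpr_hi)))


# Toy usage with made-up attack statistics (1000 trials per arm).
print(audit_epsilon_lower_bound(tp=900, n_pos=1000, fp=100, n_neg=1000))
```
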
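The Dataset Splits row notes that train/test sizes are reported (60K/10K for FMNIST, 20K/5K for Purchase-100) but no validation proportion is given. The sketch below loads FMNIST with the standard `tf.keras` loader (not necessarily the authors' pipeline) and carves out a hypothetical 10% validation split; the 10% fraction and the seed are placeholder assumptions, not values from the paper.

```python
# Hypothetical validation split for FMNIST; the paper only mentions a held-out D_val.
import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
assert x_train.shape[0] == 60_000 and x_test.shape[0] == 10_000  # sizes quoted in the paper

rng = np.random.default_rng(seed=0)          # assumption: seed not specified in the paper
val_frac = 0.1                               # assumption: validation fraction not specified
perm = rng.permutation(x_train.shape[0])
n_val = int(val_frac * x_train.shape[0])
val_idx, train_idx = perm[:n_val], perm[n_val:]
x_val, y_val = x_train[val_idx], y_train[val_idx]
x_train, y_train = x_train[train_idx], y_train[train_idx]
```
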
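The Experiment Setup row lists the hyperparameters from Section F.1. The sketch below transcribes them into a configuration and builds the Purchase-100 MLP described there; the ReLU activation and the plain (non-private) SGD optimizer are assumptions for illustration, since the paper trains with DP-SGD and does not restate the activation in the quoted text.

```python
# Training configurations transcribed from Section F.1 (model code is a sketch).
import tensorflow as tf

CONFIGS = {
    "fmnist_linear":   dict(epochs=30,  batch_size=100, learning_rate=0.02),
    "fmnist_mlp":      dict(epochs=30,  batch_size=100, learning_rate=0.01),
    "purchase100_mlp": dict(epochs=100, batch_size=100, learning_rate=0.05),
}


def purchase100_mlp(num_features: int = 600, num_classes: int = 100) -> tf.keras.Model:
    """MLP with 2 hidden layers of 256 units, as described for Purchase-100.

    The ReLU activation is an assumption; the paper's quoted text does not specify it.
    """
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(num_features,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(num_classes),
    ])


cfg = CONFIGS["purchase100_mlp"]
model = purchase100_mlp()
model.compile(
    # Plain SGD shown for simplicity; the paper uses DP-SGD with clipping and noise.
    optimizer=tf.keras.optimizers.SGD(learning_rate=cfg["learning_rate"]),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```
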