Unleashing the Power of Randomization in Auditing Differentially Private ML
Authors: Krishna Pillutla, Galen Andrew, Peter Kairouz, H. Brendan McMahan, Alina Oprea, Sewoong Oh
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We audit an unknown Gaussian mechanism with black-box access and demonstrate (up to) 16× improvement in the sample complexity. We also show how to seamlessly lift recently proposed canary designs in our recipe to improve the sample complexity on real data. (Section 1) and We compare the proposed LiDP auditing recipe relative to the standard one for DP training of machine learning models. Our code is available online. (Section 6) A hedged sketch of the standard (non-lifted) auditing recipe appears after this table. |
| Researcher Affiliation | Collaboration | Krishna Pillutla (1), Galen Andrew (1), Peter Kairouz (1), H. Brendan McMahan (1), Alina Oprea (1, 2), Sewoong Oh (1, 3); affiliations: 1 Google Research, 2 Northeastern University, 3 University of Washington |
| Pseudocode | Yes | Algorithm 1 Auditing Lifted DP |
| Open Source Code | Yes | Our code is available online. (Section 6) with footnote 1 giving the URL: https://github.com/google-research/federated/tree/master/lidp_auditing |
| Open Datasets | Yes | We test with two classification tasks: FMNIST [65] is a 10-class grayscale image classification dataset, while Purchase-100 is a sparse tabular dataset with 600 binary features and 100 classes [19, 53]. (Section 6) and FMNIST: FMNIST or Fashion MNIST [65]... The dataset is available under the MIT license. (Section F.1) and Purchase-100:... The dataset is available publicly on Kaggle but the owners have not created a license as far as we could tell. (Section F.1) |
| Dataset Splits | No | The paper mentions '60K train images and 10K test images' for FMNIST and '20K training points and 5K test points' for Purchase-100. It also states 'maximize the validation accuracy' and refers to 'Dval' as a held-out dataset, but it does not provide explicit sizes or proportions for the validation split, which are necessary for reproducibility (see the split sketch after this table). |
| Hardware Specification | Yes | Hardware. We run each job on an internal compute cluster using only CPUs (i.e., no hardware accelerators such as GPUs were used). Each job was run with 8 CPU cores and 16G memory. (Section F.3) |
| Software Dependencies | No | The paper mentions general software components like 'DP-SGD [1]', 'cross-entropy loss', and 'stochastic gradient descent'. However, it does not provide specific version numbers for any programming languages, libraries, or frameworks (e.g., Python version, TensorFlow/PyTorch version, scikit-learn version). |
| Experiment Setup | Yes | We train each model for 30 epochs with a batch size of 100 and a fixed learning rate of 0.02 for the linear model and 0.01 for the MLP. (F.1 FMNIST) and The model is an MLP with 2 hidden layers of 256 units each. It is trained for 100 epochs with a batch size of 100 and a fixed learning rate of 0.05. (F.1 Purchase-100) A configuration sketch using these values follows this table. |
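
The Research Type row quotes the paper's comparison of the proposed LiDP recipe against the standard auditing recipe. Below is a minimal sketch of that *standard* baseline only (not the authors' LiDP method): converting the empirical true/false positive rates of a canary membership-inference attack into a lower bound on epsilon via the (ε, δ)-DP relation TPR ≤ e^ε · FPR + δ and Clopper-Pearson confidence intervals. All function names, the δ value, and the trial counts are illustrative assumptions, not values from the paper.

```python
# Sketch of the standard DP auditing lower bound (baseline, not LiDP).
import numpy as np
from scipy.stats import beta


def clopper_pearson(successes: int, trials: int, alpha: float = 0.05):
    """Two-sided Clopper-Pearson confidence interval for a binomial proportion."""
    lo = beta.ppf(alpha / 2, successes, trials - successes + 1) if successes > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, successes + 1, trials - successes) if successes < trials else 1.0
    return lo, hi


def audit_epsilon_lower_bound(tp: int, n_pos: int, fp: int, n_neg: int,
                              delta: float = 1e-5, alpha: float = 0.05) -> float:
    """Lower bound on epsilon implied by TPR <= exp(eps) * FPR + delta.

    tp / n_pos: attack successes on trials where the canary WAS inserted.
    fp / n_neg: attack "successes" on trials where the canary was NOT inserted.
    """
    tpr_lo, _ = clopper_pearson(tp, n_pos, alpha)   # pessimistic (low) TPR
    _, fpr_hi = clopper_pearson(fp, n_neg, alpha)   # pessimistic (high) FPR
    if tpr_lo <= delta or fpr_hi <= 0.0:
        return 0.0
    return max(0.0, float(np.log((tpr_lo - delta) / fpr_hi)))


# Toy usage with made-up attack statistics (1000 trials per arm).
print(audit_epsilon_lower_bound(tp=900, n_pos=1000, fp=100, n_neg=1000))
```
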
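The Dataset Splits row notes that train/test sizes are reported (60K/10K for FMNIST, 20K/5K for Purchase-100) but no validation proportion is given. The sketch below loads FMNIST with the standard `tf.keras` loader (not necessarily the authors' pipeline) and carves out a hypothetical 10% validation split; the 10% fraction and the seed are placeholder assumptions, not values from the paper.

```python
# Hypothetical validation split for FMNIST; the paper only mentions a held-out D_val.
import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
assert x_train.shape[0] == 60_000 and x_test.shape[0] == 10_000  # sizes quoted in the paper

rng = np.random.default_rng(seed=0)          # assumption: seed not specified in the paper
val_frac = 0.1                               # assumption: validation fraction not specified
perm = rng.permutation(x_train.shape[0])
n_val = int(val_frac * x_train.shape[0])
val_idx, train_idx = perm[:n_val], perm[n_val:]
x_val, y_val = x_train[val_idx], y_train[val_idx]
x_train, y_train = x_train[train_idx], y_train[train_idx]
```
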
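The Experiment Setup row lists the hyperparameters from Section F.1. The sketch below transcribes them into a configuration and builds the Purchase-100 MLP described there; the ReLU activation and the plain (non-private) SGD optimizer are assumptions for illustration, since the paper trains with DP-SGD and does not restate the activation in the quoted text.

```python
# Training configurations transcribed from Section F.1 (model code is a sketch).
import tensorflow as tf

CONFIGS = {
    "fmnist_linear":   dict(epochs=30,  batch_size=100, learning_rate=0.02),
    "fmnist_mlp":      dict(epochs=30,  batch_size=100, learning_rate=0.01),
    "purchase100_mlp": dict(epochs=100, batch_size=100, learning_rate=0.05),
}


def purchase100_mlp(num_features: int = 600, num_classes: int = 100) -> tf.keras.Model:
    """MLP with 2 hidden layers of 256 units, as described for Purchase-100.

    The ReLU activation is an assumption; the paper's quoted text does not specify it.
    """
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(num_features,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(num_classes),
    ])


cfg = CONFIGS["purchase100_mlp"]
model = purchase100_mlp()
model.compile(
    # Plain SGD shown for simplicity; the paper uses DP-SGD with clipping and noise.
    optimizer=tf.keras.optimizers.SGD(learning_rate=cfg["learning_rate"]),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```
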