The importance of feature preprocessing for differentially private linear optimization

Authors: Ziteng Sun, Ananda Theertha Suresh, Aditya Krishna Menon

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically demonstrate our findings by evaluating DPSGD and our proposed feature-normalized DPSGD (DPSGD-F) for the task of training a linear classifier on three popular image classification datasets: (1) MNIST (LeCun et al., 1998); (2) Fashion-MNIST (Xiao et al., 2017); (3) CIFAR-100 (Krizhevsky et al., 2009) with pretrained features. Our results are listed in Table 1. Our proposed algorithm consistently improves upon DPSGD for all three datasets under ε = 1 and ε = 2, which empirically demonstrates our theoretical findings."
Researcher Affiliation | Industry | Ziteng Sun, Ananda Theertha Suresh, Aditya Krishna Menon; Google Research, New York; {zitengsun,theertha,adityakmenon}@google.com
Pseudocode | Yes | Algorithm 1: Differentially private SGD (Abadi et al., 2016); Algorithm 2: DPSGD with feature preprocessing (DPSGD-F); Algorithm 3: Modified version of DPSGD with feature preprocessing (DPSGD-F). (A minimal DPSGD sketch appears after this table.)
Open Source Code | No | The paper mentions the 'JAX (Bradbury et al., 2018) library' and 'Tensorflow Privacy (Google, 2018)' and provides URLs for these third-party tools, but it does not state that the authors release their own code for the methodology described in this paper.
Open Datasets | Yes | "We empirically demonstrate our findings by evaluating DPSGD and our proposed feature-normalized DPSGD (DPSGD-F) for the task of training a linear classifier on three popular image classification datasets: (1) MNIST (LeCun et al., 1998); (2) Fashion-MNIST (Xiao et al., 2017); (3) CIFAR-100 (Krizhevsky et al., 2009) with pretrained features."
Dataset Splits | No | The paper uses standard datasets (MNIST, Fashion-MNIST, CIFAR-100) but does not explicitly specify how they were split into training, validation, and test sets (e.g., percentages, sample counts, or citations to predefined splits). The grid search over parameters implies some validation process, but the split details are missing.
Hardware Specification | No | The paper does not provide any specific hardware details, such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper states: 'We implement all algorithms and experiments using the open-source JAX (Bradbury et al., 2018) library. For privacy accounting, we use the PLD accountant implemented in Tensorflow Privacy (Google, 2018).' However, it does not provide specific version numbers for JAX or Tensorflow Privacy, which are necessary for full reproducibility. (An accounting example appears after this table.)
Experiment Setup | Yes | Detailed implementation and parameter settings are listed in Appendix C, which states: 'For each combination of (ALGORITHM, ε, DATASET), we fix the clipping norm CG to be 1 as suggested in De et al. (2022), and perform a grid search over CF, BATCH SIZE, LEARNING RATE, and NUMBER OF EPOCHS from the list below: (1) CF: 1, 10, 100, 1000; (2) BATCH SIZE: 256, 512, 1024, 2048, 4096, 8192, 16384; (3) LEARNING RATE: 0.03125, 0.0625, 0.125, 0.25, 0.5, 1, 2, 4, 8, 16; (4) NUMBER OF EPOCHS: 20, 40, 80, 160, 320.' (A grid-search sketch appears after this table.)
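
The Pseudocode row references Algorithm 1, standard DPSGD (Abadi et al., 2016). As a reading aid, here is a minimal sketch of one DPSGD step for a linear classifier in JAX, the library the authors use. The binary logistic loss and the hyperparameter values are illustrative assumptions, not the authors' code; Algorithm 2 (DPSGD-F) additionally applies a private feature-normalization step before training, which is omitted here.

```python
# Minimal sketch of one DPSGD step (Abadi et al., 2016) for a linear model.
# Loss and hyperparameter values are illustrative, not the paper's tuned ones.
import jax
import jax.numpy as jnp

CLIP_NORM = 1.0          # the paper fixes the gradient clipping norm C_G = 1
NOISE_MULTIPLIER = 1.0   # illustrative; in practice set via the privacy accountant
LEARNING_RATE = 0.25     # one value from the Appendix C grid

def loss_fn(w, x, y):
    # Binary logistic loss for a linear classifier; a stand-in for the
    # multiclass objective used in the paper. y is assumed to be in {-1, +1}.
    return jnp.log1p(jnp.exp(-y * jnp.dot(x, w)))

# Per-example gradients: vmap over the batch dimension of (x, y).
per_example_grads = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))

@jax.jit
def dpsgd_step(w, xs, ys, key):
    grads = per_example_grads(w, xs, ys)                   # shape (batch, dim)
    norms = jnp.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads * jnp.minimum(1.0, CLIP_NORM / norms)  # clip each to C_G
    noise = NOISE_MULTIPLIER * CLIP_NORM * jax.random.normal(key, w.shape)
    noisy_mean = (jnp.sum(clipped, axis=0) + noise) / xs.shape[0]
    return w - LEARNING_RATE * noisy_mean
```

A full run would iterate dpsgd_step over sampled batches, splitting the PRNG key at every step so that fresh Gaussian noise is drawn each time.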
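The Software Dependencies row notes that the authors use the PLD accountant from Tensorflow Privacy without version numbers. As a hedged illustration of where accounting fits in the pipeline, the snippet below uses the library's RDP-based compute_dp_sgd_privacy helper, a simpler stand-in for the PLD accountant the authors cite; the import path and preferred accounting API vary across Tensorflow Privacy versions, and the noise multiplier here is an assumed value.

```python
# Illustrative (epsilon, delta) accounting for DPSGD with Tensorflow Privacy.
# NOTE: the paper uses the PLD accountant; this RDP-based helper is a stand-in,
# and its import path differs across tensorflow_privacy versions.
from tensorflow_privacy import compute_dp_sgd_privacy

eps, opt_order = compute_dp_sgd_privacy(
    n=60000,               # MNIST training-set size
    batch_size=1024,       # one of the Appendix C grid values
    noise_multiplier=1.0,  # assumed; not reported per-run in the table above
    epochs=40,             # one of the Appendix C grid values
    delta=1e-5,
)
print(f"DPSGD satisfies ({eps:.2f}, 1e-5)-DP under RDP accounting")
```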
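Finally, the Appendix C grid in the Experiment Setup row implies 4 × 7 × 10 × 5 = 1,400 configurations per (ALGORITHM, ε, DATASET) combination. A minimal sketch of that sweep follows; train_and_evaluate is a hypothetical stand-in for a full DPSGD(-F) training run and is not defined in the paper.

```python
# Sketch of the Appendix C hyperparameter sweep (1,400 configurations).
# train_and_evaluate is a hypothetical placeholder, not the authors' code.
import itertools

GRID = {
    "feature_clip_norm": [1, 10, 100, 1000],                           # C_F
    "batch_size": [256, 512, 1024, 2048, 4096, 8192, 16384],
    "learning_rate": [0.03125, 0.0625, 0.125, 0.25, 0.5, 1, 2, 4, 8, 16],
    "num_epochs": [20, 40, 80, 160, 320],
}

best_acc, best_config = -1.0, None
for values in itertools.product(*GRID.values()):
    config = dict(zip(GRID, values))
    acc = train_and_evaluate(**config)  # hypothetical training routine
    if acc > best_acc:
        best_acc, best_config = acc, config
```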