Fast and Memory Efficient Differentially Private-SGD via JL Projections

Authors: Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, Uthaipon Tantipongpipat

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we demonstrate experimentally that compared to existing implementations of DP-SGD with exact per-sample gradient clipping, our optimizers have significant advantages in speed and memory cost while achieving comparable accuracy-vs-privacy tradeoff.
Researcher Affiliation | Collaboration | Zhiqi Bu (University of Pennsylvania, zbu@sas.upenn.edu); Sivakanth Gopi (Microsoft Research, sigopi@microsoft.com); Janardhan Kulkarni (Microsoft Research, jakul@microsoft.com); Yin Tat Lee (University of Washington, yintat@uw.edu); Judy Hanwen Shen (Stanford University, jhshen@stanford.edu); Uthaipon Tantipongpipat (Twitter, uthaipon@gmail.com)
Pseudocode | Yes | Algorithm 1: Differentially private SGD using JL projections (DP-SGD-JL). (A hedged sketch of one such step appears after the table.)
Open Source Code | Yes | The code for our experiments is available in the supplementary material.
Open Datasets | Yes | on the IMDb dataset for sentiment analysis. We train the same single-layer bidirectional LSTM as in the [Ten] tutorial, using the same IMDb dataset with 8k vocabulary. We train a convolutional neural network from the [TP] tutorial on the MNIST dataset, which has 60,000 training samples. (A loading example appears after the table.)
Dataset Splits | No | The paper mentions training data sizes (e.g., 25,000 training samples for IMDb, 60,000 training samples for MNIST) but does not provide explicit train/validation/test dataset splits or their percentages/counts.
Hardware Specification | Yes | We use one Tesla P100 16GB GPU for all experiments.
Software Dependencies | Yes | We use TensorFlow and [TP] for all our experiments because [Opa] does not support arbitrary network architectures. Moreover, TensorFlow has an efficient implementation of jvp while PyTorch doesn't. Supported in tf-nightly 2.4.0.dev20200924 as tf.autodiff.ForwardAccumulator(θ, v).jvp(F). JAX also has an implementation of jvp. (A small jvp example appears after the table.)
Experiment Setup | Yes | We set β1 = 0.9, β2 = 0.999, σ = 0.6, C = 1, B = 256, η = 0.001, E = 15. We use Adam as the optimizer. (An illustrative mapping of these hyperparameters appears after the table.)
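
The Software Dependencies row cites the forward-mode primitive tf.autodiff.ForwardAccumulator(θ, v).jvp(F). The toy snippet below only illustrates that call pattern (the accumulator is used as a context manager in released TensorFlow); the function F, θ, and v here are arbitrary placeholders, not anything from the paper.

```python
import tensorflow as tf

# Forward-mode jvp: one forward pass yields the directional derivative <∇F(θ), v>
# without materialising the full Jacobian.
theta = tf.Variable([1.0, 2.0, 3.0])
v = tf.constant([0.1, -0.2, 0.3])  # tangent (direction) vector

with tf.autodiff.ForwardAccumulator(primals=theta, tangents=v) as acc:
    F = tf.reduce_sum(tf.sin(theta))  # any function of theta evaluated inside the context

print(acc.jvp(F))  # equals sum_i cos(theta_i) * v_i
```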
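
Algorithm 1 (DP-SGD-JL) is only named in the Pseudocode row above. The following TensorFlow sketch shows the idea as we read it from the quoted evidence: per-sample gradient norms are estimated with JL projections computed by forward-mode jvp, clipping is applied by reweighting the per-sample losses so a single backward pass suffices, and Gaussian noise is added. All names (dp_sgd_jl_step, loss_fn, num_proj, etc.) are our own; this is a hedged sketch, not the authors' released code, and it omits privacy accounting.

```python
import tensorflow as tf

def dp_sgd_jl_step(model, loss_fn, x, y, optimizer,
                   clip_norm=1.0, noise_mult=0.6, num_proj=8):
    """One DP-SGD-JL style step (sketch). loss_fn must return per-example losses."""
    params = model.trainable_variables

    # 1) JL projections: each random direction v gives <per-sample gradient, v>
    #    for the whole batch via one forward-mode jvp pass.
    proj = []
    for _ in range(num_proj):
        tangents = [tf.random.normal(p.shape) for p in params]
        with tf.autodiff.ForwardAccumulator(primals=params, tangents=tangents) as acc:
            per_sample_loss = loss_fn(y, model(x))        # shape [batch]
        proj.append(acc.jvp(per_sample_loss))             # shape [batch]
    proj = tf.stack(proj, axis=1)                         # [batch, num_proj]

    # 2) Estimate per-sample gradient norms from the projections.
    est_norm = tf.norm(proj, axis=1) / tf.sqrt(float(num_proj))

    # 3) Clip by reweighting per-sample losses, then one ordinary backward pass.
    weights = tf.minimum(1.0, clip_norm / (est_norm + 1e-12))
    with tf.GradientTape() as tape:
        per_sample_loss = loss_fn(y, model(x))
        weighted_loss = tf.reduce_sum(tf.stop_gradient(weights) * per_sample_loss)
    grads = tape.gradient(weighted_loss, params)

    # 4) Add Gaussian noise calibrated to the clipping norm, average, and update.
    batch = tf.cast(tf.shape(x)[0], tf.float32)
    noisy = [(g + tf.random.normal(g.shape, stddev=noise_mult * clip_norm)) / batch
             for g in grads]
    optimizer.apply_gradients(zip(noisy, params))
```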
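
The Open Datasets row names IMDb with an 8k vocabulary and MNIST with 60,000 training samples. A minimal way to obtain both through Keras is shown below; the paper does not state that it uses these exact loaders, so treat this purely as an illustration.

```python
import tensorflow as tf

# IMDb sentiment analysis with an 8k-word vocabulary, as in the quoted setup.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=8000)

# MNIST: 60,000 training images, 10,000 test images.
(mx_train, my_train), (mx_test, my_test) = tf.keras.datasets.mnist.load_data()
```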
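
The Experiment Setup row lists β1, β2, σ, C, B, η, E and Adam. Below is one plausible mapping of those symbols onto a standard Keras Adam optimizer plus the DP and schedule parameters; the variable names and the mapping itself are our assumption, not the authors' training script.

```python
import tensorflow as tf

# Adam with the quoted β1, β2 and learning rate η.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)

noise_multiplier = 0.6   # σ  (noise multiplier for DP)
clip_norm = 1.0          # C  (per-sample gradient clipping norm)
batch_size = 256         # B
epochs = 15              # E
```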