cpSGD: Communication-efficient and differentially-private distributed SGD

Authors: Naman Agarwal, Ananda Theertha Suresh, Felix Xinnan X. Yu, Sanjiv Kumar, Brendan McMahan

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. At each step 10,000 clients send their data to the server. The results are in Figure 2.
Researcher Affiliation | Industry | Naman Agarwal, Google Brain, Princeton, NJ 08540, namanagarwal@google.com; Ananda Theertha Suresh, Google Research, New York, NY, theertha@google.com; Felix Yu, Google Research, New York, NY, felixyu@google.com; Sanjiv Kumar, Google Research, New York, NY, sanjivk@google.com; H. Brendan McMahan, Google Research, Seattle, WA, mcmahan@google.com
Pseudocode | No | The paper describes the steps of its mechanisms but does not provide a formal pseudocode block or algorithm box.
Open Source Code | No | The paper does not provide any statement or link indicating that open-source code for the described methodology is available.
Open Datasets | Yes | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients.
Dataset Splits | No | The paper mentions using the 'infinite MNIST dataset' and training for 'one epoch', but it does not specify explicit train/validation/test splits, percentages, or sample counts.
Hardware Specification | No | The paper refers to 'mobile devices' as clients, but it does not specify the hardware (e.g., GPU models, CPU types) used by the authors to run their experiments.
Software Dependencies | No | The paper cites 'Tensorflow' [1] as a basic building block, but it does not specify a version number for TensorFlow or any other software dependencies used in their implementation.
Experiment Setup | Yes | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. At each step 10,000 clients send their data to the server... k is the number of quantization levels, and m is the parameter of the binomial noise (p = 0.5, s = 1). The baseline is without quantization and differential privacy. δ = 10^-9. We note that we trained the model with exactly one epoch. (Sketches of the quoted model and mechanism follow the table.)
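The 'Experiment Setup' row above can be made concrete with a minimal TensorFlow/Keras sketch of the quoted architecture. This is a sketch under assumptions: "three-layer" is read here as three 60-unit ReLU hidden layers, and the optimizer, learning rate, and output head are illustrative choices not given in the quoted text.

```python
import tensorflow as tf

# Minimal sketch of the quoted architecture: a feed-forward network with
# 60-unit ReLU hidden layers on flattened 28x28 MNIST digits.
# ASSUMPTIONS: "three-layer" is read as three hidden layers; the optimizer,
# learning rate, and 10-class logits head are illustrative, not from the paper.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(60, activation="relu"),
    tf.keras.layers.Dense(60, activation="relu"),
    tf.keras.layers.Dense(60, activation="relu"),
    tf.keras.layers.Dense(10),  # logits for the 10 digit classes
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```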
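Likewise, the quoted parameters (k quantization levels, binomial noise with parameter m, p = 0.5, s = 1) suggest a minimal NumPy sketch of the client/server mechanism: per-coordinate stochastic k-level quantization followed by integer binomial noise, with the known noise mean removed at the server. The clipping range [x_min, x_max], the function names, and the omission of the paper's rotation/scaling preprocessing are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def cpsgd_client_encode(grad, k, m, p=0.5, x_min=-1.0, x_max=1.0, rng=None):
    """Stochastic k-level quantization plus Binomial(m, p) noise.
    Sketch only: the paper's rotation/scaling preprocessing is omitted,
    and [x_min, x_max] is an assumed clipping range."""
    rng = rng if rng is not None else np.random.default_rng()
    step = (x_max - x_min) / (k - 1)
    # Map each coordinate onto the k-level grid over [x_min, x_max].
    scaled = (np.clip(grad, x_min, x_max) - x_min) / step
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part, so the
    # quantized value is unbiased in expectation.
    levels = lower + (rng.random(grad.shape) < scaled - lower)
    # Integer-valued binomial noise supplies the differential privacy.
    noise = rng.binomial(m, p, size=grad.shape)
    return levels.astype(np.int64) + noise

def cpsgd_server_decode(summed, n_clients, k, m, p=0.5, x_min=-1.0, x_max=1.0):
    """Average the integer messages, subtract the known noise mean m*p,
    and undo the quantization scaling."""
    step = (x_max - x_min) / (k - 1)
    return (summed / n_clients - m * p) * step + x_min

# Example: 10 clients, 5-dimensional gradients, k = 16 levels, m = 8.
rng = np.random.default_rng(0)
grads = rng.normal(scale=0.1, size=(10, 5))
msgs = sum(cpsgd_client_encode(g, k=16, m=8, rng=rng) for g in grads)
print(cpsgd_server_decode(msgs, n_clients=10, k=16, m=8))
print(grads.mean(axis=0))  # compare with the true mean gradient
```

Because both the quantization and the noise are integer-valued, each client transmits small integers rather than floats, which is the communication-efficiency side of the paper's trade-off.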