cpSGD: Communication-efficient and differentially-private distributed SGD
Authors: Naman Agarwal, Ananda Theertha Suresh, Felix Xinnan X. Yu, Sanjiv Kumar, Brendan McMahan
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. At each step 10,000 clients send their data to the server. The results are in Figure 2. |
| Researcher Affiliation | Industry | Naman Agarwal, Google Brain, Princeton, NJ 08540, namanagarwal@google.com; Ananda Theertha Suresh, Google Research, New York, NY, theertha@google.com; Felix Yu, Google Research, New York, NY, felixyu@google.com; Sanjiv Kumar, Google Research, New York, NY, sanjivk@google.com; H. Brendan McMahan, Google Research, Seattle, WA, mcmahan@google.com |
| Pseudocode | No | The paper describes the steps of its mechanisms but does not provide a formal pseudocode block or algorithm box. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. |
| Dataset Splits | No | The paper mentions using the 'infinite MNIST dataset' and training for 'one epoch', but it does not specify explicit train/validation/test splits, percentages, or sample counts. |
| Hardware Specification | No | The paper refers to 'mobile devices' as clients, but it does not specify the hardware (e.g., GPU models, CPU types) used by the authors to run their experiments. |
| Software Dependencies | No | The paper cites 'TensorFlow' [1] as a basic building block, but it does not specify a version number for TensorFlow or any other software dependencies used in their implementation. |
| Experiment Setup | Yes | We trained a three-layer model (60 hidden nodes each with ReLU activation) on the infinite MNIST dataset [8] with 25M data points and 25M clients. At each step 10,000 clients send their data to the server... k is the number of quantization levels, and m is the parameter of the binomial noise (p = 0.5, s = 1). The baseline is without quantization and differential privacy. δ = 10⁻⁹. We note that we trained the model with exactly one epoch. |
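
For context on the mechanism the Experiment Setup row quotes: in cpSGD, each client sends a k-level stochastic quantization of its update with added binomial noise (Bin(m, p), p = 0.5), which the server averages. The NumPy sketch below is a minimal illustration of one such round, assuming per-coordinate clipping to a known range [x_min, x_max]; the function names and the specific k and m values here are illustrative, not the paper's exact configuration.

```python
import numpy as np

def client_encode(x, x_min, x_max, k=32, m=128, p=0.5, rng=None):
    """Quantize a clipped update to k levels, then add Binomial(m, p) noise.
    Illustrative sketch; k and m are placeholder values."""
    rng = rng or np.random.default_rng()
    # Map each coordinate onto the integer grid {0, ..., k-1}.
    scaled = (x - x_min) / (x_max - x_min) * (k - 1)
    lower = np.floor(scaled)
    # Randomized rounding keeps the quantized value unbiased in expectation.
    levels = lower + (rng.random(x.shape) < (scaled - lower))
    # Centered binomial noise supplies the differential-privacy guarantee.
    noise = rng.binomial(m, p, size=x.shape) - m * p
    return levels + noise

def server_decode(messages, x_min, x_max, k=32):
    """Average the clients' noisy messages and map back to the value range."""
    avg_levels = np.mean(messages, axis=0)
    return avg_levels / (k - 1) * (x_max - x_min) + x_min

# Toy round: 100 clients, 5-dimensional updates clipped to [-1, 1].
rng = np.random.default_rng(0)
true_updates = rng.uniform(-1.0, 1.0, size=(100, 5))
messages = [client_encode(u, -1.0, 1.0, rng=rng) for u in true_updates]
estimate = server_decode(messages, -1.0, 1.0)
print(estimate - true_updates.mean(axis=0))  # small quantization + noise error
```

Note that the noise averages out across clients while each individual message stays private, which is why the paper can use many clients per step (10,000 in the quoted setup) with modest per-client noise.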