The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning
Authors: Wei-Ning Chen, Christopher A. Choquette-Choo, Peter Kairouz, Ananda Theertha Suresh
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we evaluate our proposed scheme on real-world federated learning tasks. We find that our theoretical analysis is well matched in practice. |
| Researcher Affiliation | Collaboration | Stanford University; Google Research. |
| Pseudocode | Yes | Algorithm 1: The DDG mechanism; Algorithm 2: Private DME with random projection; Algorithm 3: Gradient Count-Mean Sketch Encoding; Algorithm 4: Gradient Count-Mean Sketch Decoding; Algorithm 5: Distributed discrete Gaussian mechanism DDGenc (with detailed parameters); Algorithm 6: Sparse DME via Compressed Sensing. A sketch of the random-projection step of Algorithm 2 appears after the table. |
| Open Source Code | Yes | The code is available at https://github.com/google-research/federated/tree/master/private_linear_compression. |
| Open Datasets | Yes | We run experiments on the full Federated EMNIST, Stack Overflow, and Shakespeare datasets, three common benchmarks for FL tasks (Caldas et al., 2018). |
| Dataset Splits | Yes | F-EMNIST has 62 classes and N = 3400 clients, with each user holding both a train and test set of examples. In total, there are 671,585 training examples and 77,483 test examples. A minimal loading sketch for this benchmark appears after the table. |
| Hardware Specification | No | The paper mentions running experiments and training models (e.g., '10^6 parameter CNN', '4x10^6 parameter LSTM model') but does not specify any hardware details such as GPU/CPU models, memory, or cloud instances used. |
| Software Dependencies | No | The paper describes training procedures (e.g., 'SGD', 'geometric adaptive clipping', 'Discrete Fourier Transform') and hyperparameter values, but it does not list specific software libraries or frameworks with their version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x'). |
| Experiment Setup | Yes | On F-EMNIST, we use a server learning rate of 1, normalized by n (the number of clients), and momentum of 0.9 (Polyak, 1964); the client uses a learning rate of 0.01 without momentum. On Stack Overflow, we use a server learning rate of 1.78 normalized by n and momentum of 0.9; the client uses a learning rate of 0.3. For distributed DP, we use the geometric adaptive clipping of (Andrew et al., 2019) with an initial ℓ2 clipping norm of 0.1 and a target quantile of 0.5. We use the same procedure as (Kairouz et al., 2021a) and flatten using the Discrete Fourier Transform, pick β = exp(−0.5) as the conditional randomized rounding bias, and use a modular clipping target probability of 6.33e−5, or 4 standard deviations, at the server (assuming normally distributed updates). We communicate 16 bits per parameter for F-EMNIST and 18 bits for SONWP unless otherwise indicated. These values are gathered into a configuration sketch after the table. |
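The pseudocode row above lists the paper's algorithms by name only. As a rough illustration of the linear-compression idea behind Algorithm 2 (Private DME with random projection), the NumPy sketch below projects each client update with a shared Gaussian matrix and lets the server reconstruct the mean from the securely aggregated projections. The matrix scaling, the seed handling, and the omission of the discrete Gaussian noise and rounding steps are simplifying assumptions, not the authors' exact procedure.

```python
import numpy as np

def make_projection(d, k, seed=0):
    """Shared k x d Gaussian projection with E[A^T A] = I_d.

    Entries are i.i.d. N(0, 1/k), so A.T @ (A @ x) is an unbiased
    estimate of x. Clients and server derive A from a common seed
    (a simplification of the paper's setup).
    """
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))

def client_encode(x, A):
    """Compress a d-dimensional update to k dimensions.

    The actual DDG mechanism would also scale, randomly round, and
    perturb this vector with discrete Gaussian noise before secure
    aggregation; those steps are omitted here.
    """
    return A @ x

def server_decode(aggregated_y, A, n_clients):
    """Unbiased estimate of the mean update from the summed projections."""
    return (A.T @ aggregated_y) / n_clients

# Toy usage: 10 clients, d = 1000 parameters compressed to k = 200.
d, k, n = 1000, 200, 10
A = make_projection(d, k)
updates = [np.random.randn(d) for _ in range(n)]
aggregate = sum(client_encode(x, A) for x in updates)  # what SecAgg would return
mean_estimate = server_decode(aggregate, A, n)
true_mean = np.mean(updates, axis=0)
print(np.linalg.norm(mean_estimate - true_mean) / np.linalg.norm(true_mean))
```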
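For the Federated EMNIST statistics quoted in the dataset rows (62 classes, N = 3400 clients, per-user train/test splits), the benchmark can be loaded through the standard TensorFlow Federated simulation API. This is a minimal loading sketch, assuming the TFF loader rather than the authors' released pipeline; the printed counts should match the figures reported above.

```python
import tensorflow_federated as tff

# Federated EMNIST with all 62 classes (digits plus upper- and lower-case
# letters). Each of the 3,400 clients holds its own train and test examples.
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(
    only_digits=False)

print(len(emnist_train.client_ids))  # expected: 3400
client_id = emnist_train.client_ids[0]
client_dataset = emnist_train.create_tf_dataset_for_client(client_id)
print(client_dataset.element_spec)   # 'pixels': (28, 28) float32, 'label': int32
```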
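Finally, the hyperparameters in the experiment-setup row can be collected into a single configuration sketch. The dictionary and key names below are hypothetical conveniences for readability; the numeric values are the ones reported in the row above.

```python
import math

# Hypothetical config containers; values are taken from the experiment-setup row.
FEMNIST_CONFIG = {
    "server_learning_rate": 1.0,   # normalized by n, the number of clients
    "server_momentum": 0.9,        # Polyak momentum
    "client_learning_rate": 0.01,
    "client_momentum": 0.0,
    "bits_per_parameter": 16,
}

STACKOVERFLOW_CONFIG = {
    "server_learning_rate": 1.78,  # normalized by n
    "server_momentum": 0.9,
    "client_learning_rate": 0.3,
    "bits_per_parameter": 18,
}

# Distributed-DP settings shared across tasks.
DDP_CONFIG = {
    "initial_l2_clip_norm": 0.1,           # geometric adaptive clipping (Andrew et al., 2019)
    "adaptive_clip_target_quantile": 0.5,
    "flattening": "discrete_fourier_transform",
    "rounding_bias_beta": math.exp(-0.5),  # conditional randomized rounding bias
    "modular_clip_target_prob": 6.33e-5,   # about 4 standard deviations at the server
}
```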