Private Federated Learning with Autotuned Compression
Authors: Enayat Ullah, Christopher A. Choquette-Choo, Peter Kairouz, Sewoong Oh
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach on real-world datasets by achieving favorable compression rates without the need for tuning. In this section, we present an experimental evaluation of the methods proposed in Sec. 4 for federated optimization on standard FL benchmarks. |
| Researcher Affiliation | Collaboration | Enayat Ullah (1, 2), Christopher A. Choquette-Choo (3), Peter Kairouz (3), Sewoong Oh (3, 4). (1) The Johns Hopkins University; (2) work completed while on internship at Google; (3) Google Research; (4) University of Washington. |
| Pseudocode | Yes | Algorithm 1 (Adapt Norm FME), Algorithm 2 (Adapt Tail FME), Algorithm 3 (Adapt Norm FL), Algorithm 4 (Two Stage FL), Algorithm 5 (Adapt Tail FL) |
| Open Source Code | No | The paper does not provide a specific link or explicit statement about releasing the source code for the methodology described. |
| Open Datasets | Yes | We map our mean estimation technique to the Fed Avg algorithm and test it on three standard FL benchmark tasks: a character/digit recognition task on the F-EMNIST dataset and next-word prediction on the Shakespeare and StackOverflow datasets (see Sec. 5 for details). |
| Dataset Splits | Yes | We define the maximum allowed relative drop in utility (validation accuracy) compared to the baseline without compression. Table 1: Adapt Norm is stable with respect to choices in the relative error constant c0 (reported as validation accuracy). |
| Hardware Specification | No | The paper mentions runtime benchmarks ("standard DP-Fed Avg takes 3.63s/round..."), but does not specify the underlying hardware (e.g., CPU/GPU models, memory) used for these experiments. |
| Software Dependencies | No | The paper describes algorithms and refers to prior work for setup, but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We follow the exact same setup (model architectures, hyper parameters) as Chen et al. (2022a) except where noted below. Our full description can be found in App. F. ... For F-EMNIST, the server uses momentum of 0.9 and η = 0.49 with the client using a learning rate of 0.01 without momentum and a mini-batch size of 20. Our optimal server learning rates are {0.6, 0.4, 0.2, 0.1, 0.08} for noise multipliers in {0.1, 0.2, 0.3, 0.5, 0.7}, respectively. |
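The F-EMNIST hyperparameters quoted in the setup row above pair each DP noise multiplier with a reported optimal server learning rate. The following is a minimal sketch collecting those reported values into a lookup table; the constant names and the `server_lr_for_noise` helper are illustrative and not from the paper's code.

```python
# Reported (noise multiplier -> optimal server learning rate) pairs for
# F-EMNIST, as quoted from the paper's experiment setup.
OPTIMAL_SERVER_LR = {0.1: 0.6, 0.2: 0.4, 0.3: 0.2, 0.5: 0.1, 0.7: 0.08}

# Other F-EMNIST hyperparameters quoted in the setup row (names are ours).
FEMNIST_CONFIG = {
    "server_momentum": 0.9,
    "server_lr": 0.49,       # the paper's eta
    "client_lr": 0.01,
    "client_momentum": 0.0,  # "without momentum"
    "client_batch_size": 20,
}

def server_lr_for_noise(noise_multiplier: float) -> float:
    """Return the reported optimal server learning rate for a given DP
    noise multiplier; raise if the paper reports no value for it."""
    try:
        return OPTIMAL_SERVER_LR[noise_multiplier]
    except KeyError:
        raise ValueError(
            f"no reported server learning rate for noise multiplier "
            f"{noise_multiplier}"
        )
```

A lookup like `server_lr_for_noise(0.3)` returns `0.2`, matching the quoted list; unreported multipliers deliberately fail loudly rather than interpolating.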