Private Federated Learning with Autotuned Compression

Authors: Enayat Ullah, Christopher A. Choquette-Choo, Peter Kairouz, Sewoong Oh

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness of our approach on real-world datasets by achieving favorable compression rates without the need for tuning. In this section, we present experimental evaluation of the methods proposed in Sec. 4 for federated optimization, in standard FL benchmarks."
Researcher Affiliation | Collaboration | Enayat Ullah* (1, 2), Christopher A. Choquette-Choo* (3), Peter Kairouz* (3), Sewoong Oh* (3, 4); affiliations: 1 The Johns Hopkins University; 2 work completed while on internship at Google; 3 Google Research; 4 University of Washington.
Pseudocode | Yes | Algorithm 1 (Adapt Norm FME), Algorithm 2 (Adapt Tail FME), Algorithm 3 (Adapt Norm FL), Algorithm 4 (Two Stage FL), Algorithm 5 (Adapt Tail FL)
Open Source Code | No | The paper does not provide a specific link or explicit statement about releasing the source code for the methodology described.
Open Datasets | Yes | "We map our mean estimation technique to the FedAvg algorithm and test it on three standard FL benchmark tasks: character/digit recognition task on the F-EMNIST dataset and next word prediction on Shakespeare and Stackoverflow datasets (see Sec. 5 for details)." (A hedged dataset-loading sketch follows the table.)
Dataset Splits | Yes | "We define ... to be the max allowed relative drop in utility (validation accuracy) when compared to their baseline without compression." The cited Table 1 ("Adapt Norm is stable with respect to choices in the relative error constant c0") reports validation accuracy. (A small check of this utility constraint follows the table.)
Hardware Specification | No | The paper mentions runtime benchmarks ("standard DP-FedAvg takes 3.63s/round..."), but does not specify the underlying hardware (e.g., CPU/GPU models, memory) used for these experiments.
Software Dependencies | No | The paper describes algorithms and refers to prior work for setup, but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We follow the exact same setup (model architectures, hyperparameters) as Chen et al. (2022a) except where noted below. Our full description can be found in App. F." ... "For F-EMNIST, the server uses momentum of 0.9 and η = 0.49 with the client using a learning rate of 0.01 without momentum and a mini-batch size of 20. Our optimal server learning rates are {0.6, 0.4, 0.2, 0.1, 0.08} for noise multipliers in {0.1, 0.2, 0.3, 0.5, 0.7}, respectively." (A configuration sketch follows the table.)
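
The three benchmarks cited under Open Datasets (F-EMNIST, Shakespeare, StackOverflow) are distributed with TensorFlow Federated's simulation datasets. The snippet below is a minimal loading sketch under the assumption of a TFF-based pipeline; the paper does not name its framework, and the `only_digits=False` flag and the client-count print are illustrative choices, not taken from the paper.

```python
# Minimal sketch: loading the three FL benchmarks via TensorFlow Federated.
# Assumption: a TFF-based pipeline; the paper does not state its framework.
import tensorflow_federated as tff

# Federated EMNIST (character/digit recognition). only_digits=False keeps all
# 62 classes; whether the paper uses this variant is an assumption here.
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(only_digits=False)

# Shakespeare (language modeling over per-character client shards).
shakespeare_train, shakespeare_test = tff.simulation.datasets.shakespeare.load_data()

# StackOverflow (next-word prediction); note this is a large download.
so_train, so_heldout, so_test = tff.simulation.datasets.stackoverflow.load_data()

print(len(emnist_train.client_ids), "F-EMNIST training clients")
```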
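The Dataset Splits row quotes a constraint on the maximum allowed relative drop in validation accuracy versus the uncompressed baseline. The check below is one illustrative reading of that constraint, assuming the drop is measured as a relative fraction; `gamma`, `baseline_acc`, and `compressed_acc` are hypothetical names (the defining symbol is lost in the extraction above).

```python
def within_utility_budget(baseline_acc: float, compressed_acc: float, gamma: float) -> bool:
    """Return True if the compressed run loses at most a `gamma` relative
    fraction of the uncompressed baseline's validation accuracy.

    Illustrative reading of the quoted constraint; names are assumptions.
    """
    return compressed_acc >= (1.0 - gamma) * baseline_acc

# Example: allow at most a 1% relative drop in validation accuracy.
assert within_utility_budget(baseline_acc=0.86, compressed_acc=0.855, gamma=0.01)
```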
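The F-EMNIST hyperparameters quoted under Experiment Setup (server momentum 0.9, server learning rate η = 0.49, client learning rate 0.01 without momentum, mini-batch size 20, and DP server learning rates swept against noise multipliers) can be collected into a configuration sketch. Only the numeric values come from the quoted text; the dictionary layout and the use of Keras SGD optimizers are assumptions for illustration.

```python
import tensorflow as tf

# Values quoted in the paper's F-EMNIST setup; the structure is illustrative.
FEMNIST_CONFIG = {
    "client_batch_size": 20,
    # Client: plain SGD, no momentum.
    "client_optimizer": lambda: tf.keras.optimizers.SGD(learning_rate=0.01),
    # Server: SGD with momentum 0.9 and eta = 0.49 (non-private baseline).
    "server_optimizer": lambda: tf.keras.optimizers.SGD(learning_rate=0.49, momentum=0.9),
    # Tuned server learning rates per DP noise multiplier, from the quoted sweep.
    "dp_server_lr_by_noise_multiplier": {0.1: 0.6, 0.2: 0.4, 0.3: 0.2, 0.5: 0.1, 0.7: 0.08},
}

server_opt = FEMNIST_CONFIG["server_optimizer"]()
client_opt = FEMNIST_CONFIG["client_optimizer"]()
```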