Accelerating Federated Learning with Quick Distributed Mean Estimation

Authors: Ran Ben-Basat, Shay Vargaftik, Amit Portnoy, Gil Einziger, Yaniv Ben-Itzhak, Michael Mitzenmacher

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using various datasets and training tasks, we demonstrate how QUIC-FL achieves state-of-the-art accuracy with faster encoding and decoding times compared to other DME methods. (A generic DME encode/decode sketch is given after this table.)
Researcher Affiliation | Collaboration | 1. University College London; 2. VMware Research; 3. Ben-Gurion University of the Negev; 4. Harvard University.
Pseudocode | Yes | The pseudo-code of QUIC-FL appears in Algorithm 1.
Open Source Code | Yes | Our code is released as open source (Ben Basat et al., 2024). Code available at: https://github.com/amitport/QUICFL-Quick-Unbiased-Compression-for-Federated-Learning.
Open Datasets | Yes | We evaluate QUIC-FL over the Shakespeare next-word prediction task (Shakespeare; McMahan et al., 2017), citing Shakespeare, W., The Complete Works of William Shakespeare, https://www.gutenberg.org/ebooks/100; and we evaluate QUIC-FL against other schemes with 10 persistent clients over uniformly distributed CIFAR-10 and CIFAR-100 datasets (Krizhevsky et al., 2009). (A uniform client-partition sketch is given after this table.)
Dataset Splits | No | The paper mentions using specific datasets (Shakespeare, CIFAR-10, CIFAR-100) and refers to an external setup (Reddi et al., 2021), but does not explicitly state the train/validation/test splits within its text.
Hardware Specification | Yes | Using an NVIDIA RTX 3080 GPU machine with 32 GB RAM and an i7-10700K CPU @ 3.80 GHz.
Software Dependencies | No | The paper mentions using PyTorch, TensorFlow, Gekko, APMonitor, IPOPT, and APOPT, but does not specify their version numbers.
Experiment Setup | Yes | We use the setup from the federated learning benchmark of (Reddi et al., 2021), restated for convenience in Appendix I. Figure 5 shows how QUIC-FL is competitive with the asymptotically slower EDEN and markedly more accurate than other alternatives. The values in Table 5 include: clients per round 10, rounds 1200, batch size 4, client lr 1e-2, server lr 1e-3, Adam's ϵ 1e-8. For CIFAR-10 and CIFAR-100, we use the ResNet-9 (He et al., 2016) and ResNet-18 (He et al., 2016) architectures with learning rates of 0.1 and 0.05, respectively. For both datasets, the clients perform a single optimization step at each round. Our setting includes an SGD optimizer with a cross-entropy loss criterion, a batch size of 128, and a bit budget b = 1 for the DME methods. (A configuration sketch of this CIFAR setup is given after this table.)
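
For context on the encode/decode round trip that DME-based compression adds to federated averaging, the following is a minimal illustrative sketch of a generic unbiased 1-bit stochastic quantizer. It is not the paper's Algorithm 1 (QUIC-FL's actual scheme is defined there); the function names encode_1bit and decode_1bit are assumptions used only for illustration.

    import torch

    def encode_1bit(x):
        # Stochastically round each coordinate to the vector's {min, max}
        # so that the expected decoded value equals x (unbiased).
        lo, hi = x.min(), x.max()
        p = (x - lo) / (hi - lo + 1e-12)   # per-coordinate probability of rounding up
        bits = torch.bernoulli(p)
        return bits.to(torch.uint8), lo, hi

    def decode_1bit(bits, lo, hi):
        # Reconstruct an unbiased estimate of the original vector.
        return lo + bits.float() * (hi - lo)

    # Server side: the mean of the decoded client vectors is an unbiased
    # estimate of the true mean of the uncompressed client vectors.
    clients = [torch.randn(1_000) for _ in range(10)]
    decoded = torch.stack([decode_1bit(*encode_1bit(c)) for c in clients])
    estimated_mean = decoded.mean(dim=0)
    true_mean = torch.stack(clients).mean(dim=0)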
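A minimal sketch of the "10 persistent clients over uniformly distributed CIFAR-10" setup quoted above, assuming torchvision for data loading; the helper name partition_uniform, the seed, and the data root are illustrative choices, not taken from the released code.

    import torch
    from torch.utils.data import Subset
    from torchvision import datasets, transforms

    def partition_uniform(dataset, num_clients=10, seed=0):
        # Shuffle indices and split them into equally sized client shards.
        g = torch.Generator().manual_seed(seed)
        perm = torch.randperm(len(dataset), generator=g).tolist()
        shard = len(dataset) // num_clients
        return [Subset(dataset, perm[i * shard:(i + 1) * shard])
                for i in range(num_clients)]

    cifar10 = datasets.CIFAR10(root="./data", train=True, download=True,
                               transform=transforms.ToTensor())
    client_datasets = partition_uniform(cifar10, num_clients=10)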
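A minimal sketch of the reported CIFAR training configuration (SGD with a cross-entropy criterion, batch size 128, one local step per client per round, learning rates 0.1 for ResNet-9/CIFAR-10 and 0.05 for ResNet-18/CIFAR-100, bit budget b = 1). The dictionary layout and the client_local_step helper are assumptions for illustration, not the authors' implementation.

    import torch
    import torch.nn as nn

    # Hyperparameters quoted in the "Experiment Setup" row above.
    CIFAR_SETUP = {
        "cifar10":  {"architecture": "ResNet-9",  "client_lr": 0.10},
        "cifar100": {"architecture": "ResNet-18", "client_lr": 0.05},
        "batch_size": 128,
        "bit_budget_b": 1,            # 1 bit per coordinate for the DME methods
        "local_steps_per_round": 1,   # clients take a single optimization step
        "clients_per_round": 10,      # 10 persistent clients
    }

    def client_local_step(model, batch, lr):
        # One local SGD step with cross-entropy loss, matching the quoted
        # setting; `model` is assumed to be a ResNet-9/18 classifier.
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        criterion = nn.CrossEntropyLoss()
        inputs, targets = batch
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        return loss.item()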