Hybrid Local SGD for Federated Learning with Heterogeneous Communications

Authors: Yuanxiong Guo, Ying Sun, Rui Hu, Yanmin Gong

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also perform extensive experiments and show that the use of hybrid model aggregation via D2D and D2S communications in HL-SGD can largely speed up the training time of federated learning. We conduct extensive empirical experiments on two common benchmarks under realistic network settings to validate the established theoretical results of HL-SGD. Our experimental results show that HL-SGD can largely accelerate the learning process and speed up the runtime."
Researcher Affiliation | Academia | Yuanxiong Guo, The University of Texas at San Antonio, San Antonio, Texas 78249, USA, yuanxiong.guo@utsa.edu; Ying Sun, Pennsylvania State University, State College, PA 16801, USA, ysun@psu.edu; Rui Hu & Yanmin Gong, The University of Texas at San Antonio, San Antonio, Texas 78249, USA, {rui.hu@my.,yanmin.gong@}utsa.edu
Pseudocode | Yes | "Algorithm 1 HL-SGD: Hybrid Local SGD. Input: initial global model x^0, learning rate η, communication graph G_k and mixing matrix W_k for all clusters k ∈ [K], and fraction of sampled devices in each cluster p. Output: final global model x^R." (A sketch of one HL-SGD round built from these inputs is given after the table.)
Open Source Code | No | The paper states: "The algorithms are implemented by PyTorch. More details are provided in Appendix F." However, it does not provide an explicit statement about releasing the code, nor a link to a code repository.
Open Datasets | Yes | "We use two common datasets in FL literature (McMahan et al., 2017; Reddi et al., 2021; Wang et al., 2020): Federated Extended MNIST (Caldas et al., 2019) (FEMNIST) and CIFAR-10 (Krizhevsky et al., 2009)." (A dataset-loading sketch is given after the table.)
Dataset Splits | No | The paper mentions partitioning data across devices and using the original testing set, but does not explicitly give the training/validation/test splits (e.g., percentages or counts) needed for reproduction. It only states: "The 62-class FEMNIST is built by partitioning the data in Extended MNIST (Cohen et al., 2017) based on the writer of the digit/character and has a naturally-arising device partitioning. CIFAR-10 is partitioned across all devices using a Dirichlet distribution Dir(0.1) as done in (Hsu et al., 2019; Yurochkin et al., 2019; Reddi et al., 2021; Wang et al., 2020)." (A sketch of this Dirichlet partitioning is given after the table.)
Hardware Specification | Yes | "All experiments in this paper are conducted on a Linux server with 4 NVIDIA RTX 8000 GPUs."
Software Dependencies | No | The paper states: "The algorithms are implemented by PyTorch. More details are provided in Appendix F." It names PyTorch but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | "On the FEMNIST dataset, we fix the batch size as 30 and tune the learning rate η from {0.005, 0.01, 0.02, 0.05, 0.08} for each algorithm separately. On the CIFAR-10 dataset, we fix the batch size as 50 and tune η from {0.01, 0.02, 0.05, 0.08, 0.1} for each algorithm separately."
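To make the Pseudocode row concrete, below is a minimal sketch of one HL-SGD communication round, built only from the inputs listed above: local SGD on each device, D2D gossip within each cluster via the mixing matrix W_k on graph G_k, and D2S aggregation of a sampled fraction p of devices per cluster. It uses flat parameter vectors and a toy least-squares objective; the function and variable names are ours, and the exact ordering and weighting of the updates are those of the paper's Algorithm 1, not this sketch.

```python
import torch

def hl_sgd_round(global_x, clusters, mixing, device_data, eta, local_steps, p):
    """One illustrative HL-SGD-style round (sketch, not the authors' code).

    global_x:    current global model as a flat tensor of shape (d,)
    clusters:    list of device-index lists, one per cluster k
    mixing:      list of doubly stochastic mixing matrices W_k, shape (n_k, n_k)
    device_data: dict i -> (A_i, b_i) defining a toy least-squares objective
    p:           fraction of devices sampled per cluster for the D2S upload
    """
    # Every device starts the round from the current global model.
    x = {i: global_x.clone() for members in clusters for i in members}

    # 1) Local SGD: each device takes `local_steps` gradient steps on its own data
    #    (full-batch gradients here for brevity; the paper uses minibatch SGD).
    for i, (A, b) in device_data.items():
        for _ in range(local_steps):
            grad = A.T @ (A @ x[i] - b) / len(b)
            x[i] = x[i] - eta * grad

    # 2) D2D gossip: devices inside each cluster mix their models with W_k.
    for k, members in enumerate(clusters):
        stacked = torch.stack([x[i] for i in members])   # (n_k, d)
        mixed = mixing[k] @ stacked                       # apply W_k row-wise
        for row, i in enumerate(members):
            x[i] = mixed[row]

    # 3) D2S aggregation: a fraction p of devices per cluster uploads, and the
    #    server averages the uploads into the next global model.
    uploads = []
    for members in clusters:
        m = max(1, round(p * len(members)))
        picked = torch.randperm(len(members))[:m]
        uploads.extend(x[members[j]] for j in picked.tolist())
    return torch.stack(uploads).mean(dim=0)
```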
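For the Open Datasets row: CIFAR-10 can be loaded directly through torchvision, while FEMNIST is distributed with the LEAF benchmark (Caldas et al., 2019) and needs its own download scripts. A minimal sketch, with standard CIFAR-10 normalization statistics that are our choice rather than the paper's:

```python
import torchvision
import torchvision.transforms as T

# CIFAR-10 ships with torchvision; FEMNIST comes from the LEAF benchmark
# (Caldas et al., 2019) and is not shown here.
transform = T.Compose([
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10("./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10("./data", train=False,
                                        download=True, transform=transform)
```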
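The Dir(0.1) device partitioning of CIFAR-10 referenced in the Dataset Splits row is usually implemented with a per-class Dirichlet draw, following Hsu et al. (2019). The sketch below is one common implementation, not necessarily the authors' exact code:

```python
import numpy as np

def dirichlet_partition(labels, num_devices, alpha=0.1, seed=0):
    """Split sample indices across devices with a per-class Dir(alpha) draw.

    Smaller alpha (e.g., 0.1 as in the paper) gives more skewed, non-IID splits.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    device_indices = [[] for _ in range(num_devices)]
    for c in np.unique(labels):
        class_idx = rng.permutation(np.where(labels == c)[0])
        shares = rng.dirichlet(alpha * np.ones(num_devices))
        cuts = (np.cumsum(shares)[:-1] * len(class_idx)).astype(int)
        for dev, part in enumerate(np.split(class_idx, cuts)):
            device_indices[dev].extend(part.tolist())
    return device_indices
```

Each returned index list can be wrapped in torch.utils.data.Subset to build one device's local dataset; the device count is left as a free parameter, since the paper's exact value is not quoted in this report.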
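The batch sizes and learning-rate grids from the Experiment Setup row can be collected into a small tuning harness; `run_fn` below is a hypothetical callable that trains one configuration and returns its final test accuracy.

```python
# Grids as reported in the paper; the tuning harness itself is illustrative.
GRIDS = {
    "FEMNIST":  {"batch_size": 30, "lrs": [0.005, 0.01, 0.02, 0.05, 0.08]},
    "CIFAR-10": {"batch_size": 50, "lrs": [0.01, 0.02, 0.05, 0.08, 0.1]},
}

def tune_lr(dataset, run_fn):
    """Train each learning rate separately (as the paper does) and keep the best."""
    cfg = GRIDS[dataset]
    scores = {lr: run_fn(batch_size=cfg["batch_size"], lr=lr) for lr in cfg["lrs"]}
    return max(scores, key=scores.get)
```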