Resilient and Communication Efficient Learning for Heterogeneous Federated Systems
Authors: Zhuangdi Zhu, Junyuan Hong, Steve Drew, Jiayu Zhou
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Evaluation. In this section, we conduct extensive experiments to answer the following key questions, leaving more experimental details to the supplementary: 1. Is FedResCuE resilient to system heterogeneity and unstable network connections? 2. Is FedResCuE communication-efficient, reaching satisfactory performance with fewer synchronization rounds than the state-of-the-art? 3. Which components of FedResCuE have contributed to its resiliency and communication efficiency? Results: Experiments below show that FedResCuE notably outperforms related work in communication efficiency and asymptotic performance. Its superiority is consistent across different FL settings, and becomes more prominent under insufficient training data, heterogeneous model architectures, and unstable network connections. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA. 2Department of Electrical and Software Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada. Correspondence to: Zhuangdi Zhu <zhuzhuan@msu.edu>, Jiayu Zhou <jiayuz@msu.edu>. |
| Pseudocode | Yes | Algorithm 1 PROGRESSIVE SELF-DISTILLATION; Algorithm 2 FedResCuE: Resilient and Communication Efficient Federated Learning |
| Open Source Code | No | The paper does not contain an explicit statement about releasing open-source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Dataset: We use CIFAR10 and CIFAR100 (Krizhevsky et al., 2009) to simulate edge users with i.i.d. data distributions. We also apply DIGITSFIVE (Peng et al., 2019) to simulate users with statistical heterogeneity, which is a multi-domain benchmark with five image datasets: MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), USPS (Hull, 1994), Synthetic, and MNIST-M (Ganin & Lempitsky, 2015). (A hedged data-loading and partitioning sketch follows this table.) |
| Dataset Splits | No | The paper mentions 'training data' and 'testing data' but does not explicitly provide details about a validation set or its split, or how the training data is partitioned for internal validation purposes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud computing instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Pytorch' for optimizer implementation and lists 'Optimizer SGD' and 'Python' in hyper-parameter configurations, but it does not specify version numbers for these or any other software dependencies, making it difficult to reproduce the exact software environment. |
| Experiment Setup | Yes | Hyper-parameter configurations. Shared: optimizer SGD; learning rate 0.1; momentum 0.9; Nesterov TRUE; weight decay 10^-4; track training in Batch Norm FALSE; share Batch Norm TRUE; data category 10; # of active users 5; random seeds for training 3, 5, 7; batch size 32. CIFAR10: training epochs 300; # of total users 20; used training data 100%, 20%; column granularity for P 0.05. DIGITSFIVE: training epochs 100; # of total users 10; # of users per domain 2; used training data 5%; column granularity for P 0.125. (A hedged PyTorch sketch of the shared optimizer settings follows this table.) |
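
The dataset row above describes CIFAR10 simulated over edge users with i.i.d. data distributions. Below is a minimal sketch of such a partition, assuming PyTorch and torchvision; the function and variable names (`partition_iid`, `num_users`) and the seed value are illustrative and not taken from the paper, whose code is not released.

```python
# Minimal sketch: download CIFAR10 and split it i.i.d. across simulated edge users.
# Assumptions: PyTorch + torchvision are available; names (partition_iid, num_users)
# are illustrative, not from the paper.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

def partition_iid(dataset, num_users, seed=3):
    """Shuffle sample indices and split them evenly so each user holds an i.i.d. shard."""
    generator = torch.Generator().manual_seed(seed)
    indices = torch.randperm(len(dataset), generator=generator).tolist()
    shard = len(dataset) // num_users
    return [Subset(dataset, indices[i * shard:(i + 1) * shard]) for i in range(num_users)]

train_set = datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transforms.ToTensor()
)
# 20 total users for the CIFAR10 setting, per the hyper-parameter table above.
user_datasets = partition_iid(train_set, num_users=20)
```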
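
The experiment-setup row lists the shared optimizer and batch-norm settings. Below is a minimal sketch, assuming PyTorch, of how those shared settings map onto `torch.optim.SGD` and batch-norm construction; the model is a placeholder, not the architecture used in the paper.

```python
# Minimal sketch mapping the shared hyper-parameters onto PyTorch objects.
# Only the SGD settings and the batch-norm flag reflect the table above
# (lr 0.1, momentum 0.9, Nesterov, weight decay 1e-4, no running-stat tracking).
import torch
from torch import nn, optim

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder model

optimizer = optim.SGD(
    model.parameters(),
    lr=0.1,             # learning rate 0.1
    momentum=0.9,       # momentum 0.9
    nesterov=True,      # Nesterov TRUE
    weight_decay=1e-4,  # weight decay 10^-4
)

# "Track training in Batch Norm: FALSE" corresponds to not tracking running
# statistics in batch-norm layers; example construction shown for reference.
bn_layer = nn.BatchNorm2d(64, track_running_stats=False)
```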