Federated Learning under Arbitrary Communication Patterns
Authors: Dmitrii Avdiukhin, Shiva Kasiviswanathan
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we demonstrate the effectiveness of local SGD with asynchronous updates as compared to regular SGD or a synchronous local SGD where all communication takes place together. Our focus will be on illustrating the dependence of model accuracy on the total communication. We also investigate the role of (non)iidness of the clients' data distributions. |
| Researcher Affiliation | Collaboration | ¹Department of Computer Science, Indiana University, Bloomington, IN, USA; ²Amazon, Palo Alto, CA, USA. |
| Pseudocode | Yes | Algorithm 1 ASYNCCOMMSGD |
| Open Source Code | Yes | The entire code is provided in supplementary material. |
| Open Datasets | Yes | We perform our evaluation on the following datasets: MNIST, FASHION-MNIST, CIFAR-10. Dataset and detailed network descriptions are given in Appendix C. The entire code is provided in supplementary material. |
| Dataset Splits | No | The paper describes how data is partitioned across clients based on a "mixing rate µ" to simulate non-identical data distributions (e.g., "for each client, (1 − µ) fraction of data is selected from the class corresponding to the client, while µ fraction is selected from a random class"; a partitioning sketch in this spirit follows the table). However, it does not provide specific train/validation/test dataset splits by percentage or sample count for the overall datasets (e.g., an 80/10/10 split). |
| Hardware Specification | No | The paper mentions using "a single-machine simulation of FL computation" but does not provide any specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., "Python 3.8", "PyTorch 1.9"). It only mentions datasets and network descriptions in Appendix C. |
| Experiment Setup | Yes | For MNIST and FASHION-MNIST, "At each round, clients process 10³ samples (within the round, the client locally performs minibatch gradient descent with batch size 20)." It also specifies model architectures such as "a one-layer neural network with softmax activation" and "ResNet-34 (He et al., 2016) without batch normalization." The step size γ is specified for the theoretical analysis. A minimal local-SGD sketch based on these parameters follows the table. |
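
As a point of reference for the "mixing rate µ" description quoted in the Dataset Splits row, the Python sketch below partitions a labeled dataset across clients so that a (1 − µ) fraction of each client's samples comes from the client's "own" class and a µ fraction from uniformly random classes. The function name, the class-to-client mapping, and sampling with replacement are assumptions made for illustration; the paper's exact partitioning code is in its supplementary material.

```python
import numpy as np

def partition_noniid(labels, num_clients, samples_per_client, mu, seed=0):
    """Sketch of a mixing-rate partition: each client draws a (1 - mu)
    fraction from its own class and a mu fraction from random classes.
    Class-to-client mapping and with-replacement sampling are assumptions."""
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    by_class = [np.flatnonzero(labels == c) for c in range(num_classes)]

    client_indices = []
    for client in range(num_clients):
        own_class = client % num_classes  # assumed: client i "owns" class i mod C
        n_own = int(round((1.0 - mu) * samples_per_client))
        own = rng.choice(by_class[own_class], size=n_own, replace=True)
        # Remaining mu fraction: one uniformly random class per sample.
        rand_classes = rng.integers(0, num_classes, size=samples_per_client - n_own)
        mixed = np.array([rng.choice(by_class[c]) for c in rand_classes], dtype=int)
        client_indices.append(np.concatenate([own, mixed]))
    return client_indices

# Example usage with stand-in labels (not the real MNIST label vector):
# labels = np.random.randint(0, 10, size=60000)
# parts = partition_noniid(labels, num_clients=10, samples_per_client=1000, mu=0.1)
```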
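
The Pseudocode and Experiment Setup rows reference Algorithm 1 (ASYNCCOMMSGD) and per-round local minibatch SGD with batch size 20 over 10³ samples. The sketch below is only a rough single-machine simulation of local SGD with client-specific (asynchronous) communication times; the probabilistic communication schedule, the averaging rule, and the helper `grad_fn` are assumptions, not the paper's algorithm.

```python
import numpy as np

def async_local_sgd(grad_fn, client_data, w0, rounds, samples_per_round=1000,
                    batch_size=20, lr=0.01, comm_prob=0.5, seed=0):
    """Toy single-machine simulation of local SGD with asynchronous,
    client-specific communication. grad_fn(w, batch) -> gradient array;
    client_data is a list of per-client sample arrays. The schedule and
    averaging rule are assumptions, not the paper's ASYNCCOMMSGD."""
    rng = np.random.default_rng(seed)
    server_w = w0.copy()
    client_w = [w0.copy() for _ in client_data]

    for _ in range(rounds):
        for i, data in enumerate(client_data):
            # Local minibatch SGD over samples_per_round samples
            # (batch size 20, matching the setup quoted in the table).
            for _ in range(samples_per_round // batch_size):
                batch = data[rng.integers(0, len(data), size=batch_size)]
                client_w[i] -= lr * grad_fn(client_w[i], batch)
            # Asynchronous communication: each client syncs at its own times
            # (here modeled as a coin flip per round; assumed, not from the paper).
            if rng.random() < comm_prob:
                server_w = 0.5 * (server_w + client_w[i])  # assumed averaging rule
                client_w[i] = server_w.copy()
    return server_w
```

The coin-flip schedule is only a placeholder for "arbitrary communication patterns"; in the paper, communication times may differ across clients and need not be synchronized, which is the property this sketch tries to convey.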