On the Convergence of Communication-Efficient Local SGD for Federated Learning
Authors: Hongchang Gao, An Xu, Heng Huang
AAAI 2021, pp. 7510–7518
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "At last, extensive experiments are conducted to verify the performance of our proposed methods." and "Extensive experimental results confirmed the effectiveness of our proposed methods." |
| Researcher Affiliation | Collaboration | (1) Department of Computer and Information Sciences, Temple University, PA, USA; (2) Department of Electrical and Computer Engineering, University of Pittsburgh, PA, USA; (3) JD Finance America Corporation, Mountain View, CA, USA |
| Pseudocode | Yes | Algorithm 1 Local SGD with Compressed Gradients and Algorithm 2 Momentum Local SGD with Compressed Gradients (a hedged sketch of the general update pattern appears after this table) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | CIFAR-10: We test ResNet-56 (He et al. 2016) with all the above mentioned algorithms on the CIFAR-10 dataset (Krizhevsky, Hinton et al. 2009). ImageNet: We test ResNet-50 (He et al. 2016) on the ImageNet dataset (Russakovsky et al. 2015). |
| Dataset Splits | No | The paper mentions training and testing but does not specify explicit train/validation/test dataset splits by percentage, count, or a reference to predefined splits. |
| Hardware Specification | Yes | All experiments are implemented in PyTorch (Paszke et al. 2019) and run on a cluster with NVIDIA Tesla P40 GPUs, where nodes are interconnected by a network with 40 Gbps bandwidth. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2019)' but does not specify its version number or any other software dependencies with versions. |
| Experiment Setup | Yes | The base learning rate is 0.1, the weight decay is 5×10⁻⁴, and the total batch size is 128. For local SGD, the model is trained for 150 epochs in total, with a learning rate decay of 0.1 at epoch 100. For momentum local SGD, the model is trained for 200 epochs in total, with a learning rate decay of 0.1 at epochs 100 and 150 (see the schedule sketch below). |
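
The paper's Algorithm 1 is not reproduced here. The following is a minimal sketch of the general local-SGD-with-compressed-updates pattern it describes: each worker runs several local SGD steps, communicates a compressed copy of its accumulated update, and the server averages the compressed updates. The top-k compressor, helper names, and toy objective are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def topk_compress(vec, k):
    # Assumed compressor: keep the k largest-magnitude coordinates, zero the rest.
    out = np.zeros_like(vec)
    idx = np.argsort(np.abs(vec))[-k:]
    out[idx] = vec[idx]
    return out

def local_sgd_compressed(grad_fn, x0, n_workers=8, rounds=50, local_steps=4, lr=0.1, k=10):
    # Each round: every worker starts from the shared model, takes `local_steps`
    # local SGD steps, then sends a compressed copy of its accumulated update;
    # the server averages the compressed updates and applies the mean.
    x = x0.copy()
    for _ in range(rounds):
        deltas = []
        for w in range(n_workers):
            x_local = x.copy()
            for _ in range(local_steps):
                x_local = x_local - lr * grad_fn(x_local, w)  # worker-local stochastic gradient
            deltas.append(topk_compress(x_local - x, k))      # compress before communication
        x = x + np.mean(deltas, axis=0)                       # server-side averaging
    return x

if __name__ == "__main__":
    # Toy usage: each worker sees a noisy gradient of the same quadratic objective.
    rng = np.random.default_rng(0)
    grad_fn = lambda x, w: x + 0.01 * rng.standard_normal(x.shape)
    x_final = local_sgd_compressed(grad_fn, x0=np.ones(50))
    print(np.linalg.norm(x_final))  # should shrink toward 0
```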
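
The reported training schedule can be wired up with standard PyTorch APIs as sketched below. The model and training loop are placeholders, and the momentum value is an assumption (the paper excerpt above only reports the base learning rate, weight decay, batch size, and decay milestones).

```python
import torch

# Placeholder model; the paper uses ResNet-56 on CIFAR-10 and ResNet-50 on ImageNet.
model = torch.nn.Linear(10, 10)

# Reported settings: base LR 0.1, weight decay 5e-4, total batch size 128.
# momentum=0.9 is an assumption; the excerpt does not state the momentum value.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

# Momentum local SGD: 200 epochs, LR multiplied by 0.1 at epochs 100 and 150.
# (Plain local SGD: 150 epochs with a single decay at epoch 100.)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)

for epoch in range(200):
    # ... one epoch of (local) training with total batch size 128 goes here ...
    scheduler.step()  # apply the step-wise learning-rate decay once per epoch
```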