On the Convergence of Communication-Efficient Local SGD for Federated Learning

Authors: Hongchang Gao, An Xu, Heng Huang (pp. 7510-7518)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "At last, extensive experiments are conducted to verify the performance of our proposed methods." and "Extensive experimental results confirmed the effectiveness of our proposed methods."
Researcher Affiliation | Collaboration | (1) Department of Computer and Information Sciences, Temple University, PA, USA; (2) Department of Electrical and Computer Engineering, University of Pittsburgh, PA, USA; (3) JD Finance America Corporation, Mountain View, CA, USA
Pseudocode | Yes | Algorithm 1 (Local SGD with Compressed Gradients) and Algorithm 2 (Momentum Local SGD with Compressed Gradients); a hedged sketch of Algorithm 1 follows this table.
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | "CIFAR-10: We test ResNet-56 (He et al. 2016) with all the above mentioned algorithms on the CIFAR-10 dataset (Krizhevsky, Hinton et al. 2009)." and "ImageNet: We test ResNet-50 (He et al. 2016) on the ImageNet dataset (Russakovsky et al. 2015)."
Dataset Splits | No | The paper mentions training and testing but does not specify explicit train/validation/test splits by percentage, count, or reference to predefined splits.
Hardware Specification | Yes | "All experiments are implemented in PyTorch (Paszke et al. 2019) and run on a cluster with NVIDIA Tesla P40 GPUs, where nodes are interconnected by a network with 40 Gbps bandwidth."
Software Dependencies | No | The paper mentions PyTorch (Paszke et al. 2019) but does not specify its version number or any other software dependencies with versions.
Experiment Setup | Yes | "The base learning rate is 0.1, the weight decay is 5×10⁻⁴, and the total batch size is 128. For local SGD, the model is trained for 150 epochs in total, with a learning rate decay of 0.1 at epoch 100. For momentum local SGD, the model is trained for 200 epochs in total, with a learning rate decay of 0.1 at epochs 100 and 150." A hedged configuration sketch appears after this table.
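
Because the Pseudocode row only names Algorithm 1 (Local SGD with Compressed Gradients) without reproducing it, the following is a minimal, hypothetical PyTorch-style sketch of that pattern: each worker runs several local SGD steps, sends a compressed model update, and the server averages the compressed updates. The top-k sparsifier, the toy linear model, and the synthetic data are illustrative assumptions, not the authors' implementation or their exact compression operator.

# Hypothetical sketch of a local-SGD-with-compressed-updates round (not the paper's code).
import copy
import torch

def topk_compress(delta, ratio=0.1):
    # Assumed compressor: keep only the largest-magnitude entries of a flat update.
    k = max(1, int(delta.numel() * ratio))
    _, indices = torch.topk(delta.abs(), k)
    compressed = torch.zeros_like(delta)
    compressed[indices] = delta[indices]
    return compressed

def local_sgd_round(global_model, data_per_worker, local_steps=5, lr=0.1):
    # One communication round: each worker trains locally from the global model,
    # compresses its accumulated update, and the server averages the updates.
    flat_global = torch.nn.utils.parameters_to_vector(global_model.parameters()).detach()
    updates = []
    for x, y in data_per_worker:
        worker = copy.deepcopy(global_model)
        opt = torch.optim.SGD(worker.parameters(), lr=lr)
        for _ in range(local_steps):
            opt.zero_grad()
            loss = torch.nn.functional.mse_loss(worker(x), y)
            loss.backward()
            opt.step()
        flat_local = torch.nn.utils.parameters_to_vector(worker.parameters()).detach()
        updates.append(topk_compress(flat_local - flat_global))
    avg_update = torch.stack(updates).mean(dim=0)
    torch.nn.utils.vector_to_parameters(flat_global + avg_update, global_model.parameters())

if __name__ == "__main__":
    torch.manual_seed(0)
    model = torch.nn.Linear(10, 1)                                   # toy stand-in model
    workers = [(torch.randn(32, 10), torch.randn(32, 1)) for _ in range(4)]  # synthetic shards
    for _ in range(20):
        local_sgd_round(model, workers)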
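
For the Experiment Setup row, the quoted hyperparameters map naturally onto a standard PyTorch SGD plus MultiStepLR configuration. The sketch below shows the momentum local SGD schedule (200 epochs, decay of 0.1 at epochs 100 and 150); the ResNet-18 stand-in and the momentum value 0.9 are assumptions, since the quoted text names ResNet-56 (not available in torchvision) and does not give a momentum coefficient.

# Hedged sketch of the reported training schedule; assumptions are noted inline.
import torch
import torchvision

model = torchvision.models.resnet18(num_classes=10)  # stand-in; the paper uses ResNet-56
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,             # base learning rate from the paper
    momentum=0.9,       # assumed value for the momentum local SGD variant
    weight_decay=5e-4,  # weight decay from the paper
)
# Momentum local SGD: 200 epochs, LR x0.1 at epochs 100 and 150.
# Plain local SGD: 150 epochs, LR x0.1 at epoch 100, without momentum.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)

for epoch in range(200):
    # ... one epoch of training with a total batch size of 128 across workers ...
    scheduler.step()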