Accelerating Gossip SGD with Periodic Global Averaging

Authors: Yiming Chen, Kun Yuan, Yingya Zhang, Pan Pan, Yinghui Xu, Wotao Yin

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results of large-scale training on image classification (ResNet-50) and language modeling (BERT) validate our theoretical findings.
Researcher Affiliation | Industry | Alibaba Group, Hangzhou, China.
Pseudocode | Yes | Algorithm 1 Gossip-PGA (a minimal sketch of this update appears after the table).
Open Source Code | No | The paper does not provide an explicit statement about, or a link to, open-source code for the described methodology.
Open Datasets | Yes | The ImageNet-1k (Deng et al., 2009) dataset consists of 1,281,167 training images and 50,000 validation images in 1000 classes.
Dataset Splits | Yes | The ImageNet-1k (Deng et al., 2009) dataset consists of 1,281,167 training images and 50,000 validation images in 1000 classes.
Hardware Specification | No | The paper mentions training on 256 GPUs and 64 GPUs but does not specify the GPU models (e.g., NVIDIA A100 or V100) or other hardware details such as CPU, memory, or cloud instance types.
Software Dependencies | No | The paper cites PyTorch in its references but does not give version numbers for PyTorch or any other software library used in the experiments.
Experiment Setup | Yes | The learning rate is warmed up over the first 5 epochs and decayed by a factor of 10 at epochs 30, 60, and 90. The averaging period is set to 6 for both Local SGD and Gossip-PGA. In Gossip-AGA, the period starts at 4 and is adapted afterwards; roughly 9% of iterations perform global averaging (restated as code below).
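
To illustrate the Pseudocode row: Gossip-PGA interleaves local SGD steps and gossip averaging with an exact global average every H iterations. Below is a minimal NumPy sketch of that pattern, not a reproduction of the paper's Algorithm 1; the function name gossip_pga, the grad_fn gradient oracle, and the step-then-communicate ordering are illustrative assumptions.

import numpy as np

def gossip_pga(x, grad_fn, W, lr, H, T):
    # x: (n, d) array, row i holds worker i's parameters.
    # grad_fn(i, x_i): stochastic gradient oracle for worker i (assumed interface).
    # W: (n, n) doubly stochastic gossip (mixing) matrix for the topology.
    n = x.shape[0]
    for t in range(T):
        # Every worker takes a local SGD step on its own mini-batch.
        g = np.stack([grad_fn(i, x[i]) for i in range(n)])
        x = x - lr * g
        if (t + 1) % H == 0:
            # Periodic global averaging: all workers synchronize to the mean.
            x = np.tile(x.mean(axis=0), (n, 1))
        else:
            # Ordinary gossip step: each worker averages with its neighbors via W.
            x = W @ x
    return x

Because W is doubly stochastic, the gossip step preserves the network-wide average, and the occasional exact average resets the consensus error that pure gossip SGD accumulates; this interplay is what the paper's analysis quantifies.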
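
The Experiment Setup row is concrete enough to restate as a schedule. A sketch assuming the warmup is linear (the paper specifies a 5-epoch warmup but not its shape); base_lr and both function names are hypothetical.

def lr_at(epoch, base_lr):
    # Warmup over the first 5 epochs (linear shape is an assumption),
    # then step decay by 10x at epochs 30, 60, and 90, as stated in the setup.
    if epoch < 5:
        return base_lr * (epoch + 1) / 5
    decay_steps = sum(epoch >= m for m in (30, 60, 90))
    return base_lr * (0.1 ** decay_steps)

def global_average_now(step, period=6):
    # Fixed period of 6 for Local SGD and Gossip-PGA; Gossip-AGA starts at 4
    # and adapts the period during training (its adaptation rule is not shown here).
    return (step + 1) % period == 0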