Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Choice of Learning Rate for Local SGD
Authors: Lukas Balles, Prabhu Teja S, Cédric Archambeau
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the optimal learning rate for Local SGD differs substantially from that of SGD, and when using it the performance of Local SGD matches that of SGD. However, this performance comes at the cost of added training iterations, rendering Local SGD faster than SGD only when communication is much more time-consuming than computation. |
| Researcher Affiliation | Industry | Lukas Balles EMAIL Aleph Alpha, Heidelberg, Germany. Work done at AWS. Prabhu Teja S EMAIL Amazon Web Services, Berlin, Germany. Cédric Archambeau EMAIL Helsing, Berlin, Germany. Work done at AWS. |
| Pseudocode | Yes | Algorithm 1: Automatic learning rate scaling for Local SGD (LocalAdaScale). |
| Open Source Code | No | The paper mentions using a third-party library 'FairScale' and refers to GitHub repositories for model code, but it does not contain an explicit statement or a direct link from the authors releasing the source code specific to their methodology described in this paper. |
| Open Datasets | Yes | We train a ResNet-18 (He et al., 2016) on CIFAR-10 (Krizhevsky, 2009), a Wide ResNet-28-2 (Zagoruyko & Komodakis, 2016) on ImageNet-32 (Chrabaszcz et al., 2017), and a ResNet-50 on ImageNet (Deng et al., 2009; Russakovsky et al., 2015). |
| Dataset Splits | Yes | The target performance for each experiment is the performance we get when training on one worker (K = 1) with the standard hyperparameter settings (in Appendix H). For CIFAR-10 it is a top-1 accuracy of 93%, for ImageNet32 it is a top-5 accuracy of 69%, and for ImageNet it is a top-5 accuracy of 93%. See Appendix H.1 for attributions. |
| Hardware Specification | No | No specific hardware details (like GPU models, CPU models, or memory) used for the authors' experiments are provided in the paper. The paper discusses 'compute infrastructure' and 'hypothetical system' parameters (like relative communication overhead 'm'), but not the actual hardware used for their empirical evaluations. |
| Software Dependencies | No | The paper mentions using 'FairScale' and the 'torchvision' library and refers to GitHub repositories for model code (e.g., 'pytorch-cifar'), but does not specify exact version numbers for these software components. |
| Experiment Setup | Yes | Training hyperparameters are listed in the following table (per dataset: γ_base, momentum, weight decay, LR schedule, epochs): CIFAR-10: 0.1, 0.9, 5×10⁻⁴, cosine decay, 200 epochs; ImageNet32: 0.01, 0.9, 5×10⁻⁴, step (×0.5 every 10 epochs), 40 epochs; ImageNet: 0.1, 0.9, 5×10⁻⁴, step (×0.1 every 30 epochs), 90 epochs. |
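The communication pattern the paper studies can be sketched as follows: each of K workers performs several local SGD steps independently, and workers then synchronize by averaging their parameters. This is a minimal illustrative sketch on a toy quadratic objective, not the authors' LocalAdaScale algorithm; the names `local_sgd` and `noisy_grad` and all hyperparameter values are assumptions for the example.

```python
import numpy as np

def local_sgd(grad_fn, w0, lr, n_workers, local_steps, rounds, rng):
    """Minimal Local SGD sketch: each worker runs `local_steps`
    independent SGD updates, then all workers average parameters.
    Averaging is the only communication step, so communication cost
    scales with `rounds` rather than with total gradient steps."""
    w = np.array(w0, dtype=float)
    for _ in range(rounds):
        replicas = []
        for _ in range(n_workers):
            v = w.copy()
            for _ in range(local_steps):
                v -= lr * grad_fn(v, rng)  # local stochastic gradient step
            replicas.append(v)
        w = np.mean(replicas, axis=0)  # communication: parameter averaging
    return w

# Toy objective f(w) = 0.5 * ||w||^2, gradient w, with additive noise.
def noisy_grad(w, rng):
    return w + 0.1 * rng.standard_normal(w.shape)

rng = np.random.default_rng(0)
w_final = local_sgd(noisy_grad, w0=np.ones(4), lr=0.1,
                    n_workers=4, local_steps=8, rounds=50, rng=rng)
print(np.linalg.norm(w_final))  # small: near the optimum at w = 0
```

The trade-off described in the paper's abstract is visible in the structure above: raising `local_steps` cuts communication rounds but changes the optimization dynamics, which is why the optimal learning rate for Local SGD differs from that of synchronous SGD.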