Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks
Authors: Timothy Castiglia, Anirban Das, Stacy Patterson
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the effectiveness of our algorithm in a multi-level network with slow workers via simulation-based experiments. From Section 6 (Experiments): In this section, we show the performance of MLL-SGD compared to algorithms that do not account for hierarchy and heterogeneous worker rates. |
| Researcher Affiliation | Academia | T. Castiglia, A. Das, and S. Patterson are with the Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St, Troy, NY 12180, castit@rpi.edu, dasa2@rpi.edu, sep@cs.rpi.edu. |
| Pseudocode | Yes | Algorithm 1 Multi-Level Local SGD (a hedged sketch of this multi-level structure appears after the table) |
| Open Source Code | Yes | A CODE REPOSITORY The code used in our experiments can be found at: https://github.com/rpi-nsl/MLL-SGD. This code simulates a multi-level network with heterogeneous workers, and trains a model using MLL-SGD. |
| Open Datasets | Yes | We use the EMNIST (Cohen et al., 2017) and CIFAR-10 (Krizhevsky et al., 2009) datasets. We rerun our first experiment from Figure 1 with logistic regression trained on the MNIST dataset (Bottou et al., 1994). (See the dataset-loading sketch after the table.) |
| Dataset Splits | No | The paper mentions 'training loss and test accuracy' and discusses parameters like 'step size' but does not explicitly describe the use of a validation set or provide details on training/validation/test splits. |
| Hardware Specification | No | The paper states, 'We conduct experiments using Pytorch 1.4.0 and Python 3.' but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for these experiments. |
| Software Dependencies | Yes | We conduct experiments using Pytorch 1.4.0 and Python 3. |
| Experiment Setup | Yes | We train the CNN with a step size of 0.01. For ResNet, we use a standard approach of changing the step size from 0.1 to 0.01 to 0.001 over the course of training (He et al., 2016). We let qτ = 32 for all HL-SGD and MLL-SGD variations to be comparable with Local SGD. (A hedged step-size-schedule sketch appears after the table.) |
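The pseudocode row points to Algorithm 1 (Multi-Level Local SGD). The snippet below is a minimal, hypothetical sketch of the multi-level pattern that algorithm describes: workers run a few local SGD steps, sub-hubs average their workers' models, and the hub averages the sub-hub models. The toy least-squares objective, the two-sub-hub / three-worker topology, uniform averaging weights, and all hyperparameters are illustrative assumptions, not the paper's configuration or its Algorithm 1.

```python
# Hypothetical sketch (not the authors' Algorithm 1): one way to simulate a
# two-level local-SGD round on a toy least-squares problem. Workers run
# q_local_steps SGD steps, sub-hubs average their workers, the hub averages sub-hubs.
import torch

torch.manual_seed(0)
dim, q_local_steps, lr = 10, 4, 0.05

def make_shard(n=64):
    """Toy data shard for one worker: linear-regression features and targets."""
    X = torch.randn(n, dim)
    w_true = torch.arange(1.0, dim + 1)
    y = X @ w_true + 0.1 * torch.randn(n)
    return X, y

# Assumed topology for illustration: two sub-hubs, each with three workers.
groups = [[make_shard() for _ in range(3)] for _ in range(2)]
global_w = torch.zeros(dim)

def local_sgd(w, shard, steps, lr):
    """Run `steps` plain mini-batch SGD steps on one worker's shard, starting from w."""
    w = w.clone()
    X, y = shard
    for _ in range(steps):
        idx = torch.randint(0, X.shape[0], (8,))            # mini-batch indices
        grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / 8      # squared-loss gradient
        w -= lr * grad
    return w

for _round in range(20):
    subhub_models = []
    for workers in groups:
        # Each worker starts from the current global model and runs local SGD.
        worker_models = [local_sgd(global_w, shard, q_local_steps, lr) for shard in workers]
        # Sub-hub averages its workers (uniform weights assumed here).
        subhub_models.append(torch.stack(worker_models).mean(dim=0))
    # Hub averages the sub-hub models to form the next global model.
    global_w = torch.stack(subhub_models).mean(dim=0)

print("recovered weights:", global_w)
```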
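The datasets quoted in the Open Datasets row (EMNIST, CIFAR-10, MNIST) are all distributed with torchvision, which matches the PyTorch toolchain reported above. The EMNIST split, the transform, and the data path in this sketch are assumptions; the linked repository is the authoritative source for how the data is actually prepared and partitioned across workers.

```python
# Hedged sketch: loading the cited public datasets with torchvision.
# The 'digits' split and ToTensor transform are assumptions for illustration.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

emnist_train = datasets.EMNIST(root="./data", split="digits", train=True,
                               download=True, transform=to_tensor)
cifar_train = datasets.CIFAR10(root="./data", train=True,
                               download=True, transform=to_tensor)
```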
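The Experiment Setup row reports a fixed step size of 0.01 for the CNN and a ResNet schedule that drops from 0.1 to 0.01 to 0.001 over training. A minimal sketch of that schedule using PyTorch's MultiStepLR is shown below; the milestone epochs, epoch count, and stand-in model are assumptions for illustration only, since the quoted setup does not state when the drops occur.

```python
# Minimal sketch of the reported ResNet step-size schedule (0.1 -> 0.01 -> 0.001).
# Milestones [80, 120] and the stand-in linear model are assumptions, not the paper's values.
# The CNN is reported to use a fixed step size of 0.01 (no schedule).
import torch
import torch.nn as nn

model = nn.Linear(32 * 32 * 3, 10)                         # stand-in for the actual ResNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[80, 120], gamma=0.1)             # 0.1 -> 0.01 -> 0.001

for epoch in range(150):
    # ... training loop over mini-batches would go here ...
    optimizer.step()        # placeholder step so the scheduler follows a real optimizer step
    scheduler.step()
```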