Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks

Authors: Timothy Castiglia, Anirban Das, Stacy Patterson

ICLR 2021

Reproducibility assessment. Each entry lists the variable, the assessed result, and the LLM response with supporting quotes from the paper.

Research Type: Experimental
    "We illustrate the effectiveness of our algorithm in a multi-level network with slow workers via simulation-based experiments." and "6 EXPERIMENTS: In this section, we show the performance of MLL-SGD compared to algorithms that do not account for hierarchy and heterogeneous worker rates."

Researcher Affiliation: Academia
    "T. Castiglia, A. Das, and S. Patterson are with the Department of Computer Science, Rensselaer Polytechnic Institute, 110 8th St, Troy, NY 12180, castit@rpi.edu, dasa2@rpi.edu, sep@cs.rpi.edu."

Pseudocode: Yes
    "Algorithm 1 Multi-Level Local SGD"

Open Source Code: Yes
    "A CODE REPOSITORY: The code used in our experiments can be found at: https://github.com/rpi-nsl/MLL-SGD. This code simulates a multi-level network with heterogeneous workers, and trains a model using MLL-SGD."

Open Datasets: Yes
    "We use the EMNIST (Cohen et al., 2017) and CIFAR-10 (Krizhevsky et al., 2009) datasets." and "We rerun our first experiment from Figure 1 with logistic regression trained on the MNIST dataset (Bottou et al., 1994)."

Dataset Splits: No
    The paper mentions "training loss and test accuracy" and discusses parameters such as "step size," but it does not explicitly describe the use of a validation set or provide details on training/validation/test splits.

Hardware Specification: No
    The paper states, "We conduct experiments using Pytorch 1.4.0 and Python 3." but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for these experiments.

Software Dependencies: Yes
    "We conduct experiments using Pytorch 1.4.0 and Python 3."

Experiment Setup: Yes
    "We train the CNN with a step size of 0.01. For ResNet, we use a standard approach of changing the step size from 0.1 to 0.01 to 0.001 over the course of training (He et al., 2016). We let qτ = 32 for all HL-SGD and MLL-SGD variations to be comparable with Local SGD."
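The paper's Algorithm 1 itself is not reproduced on this page. As a rough, hypothetical illustration of the pattern it names (workers running local SGD at heterogeneous rates inside a two-level hub hierarchy), the following self-contained sketch applies that nesting to a toy 1-D least-squares problem. All function names, hyperparameters, and the unweighted averaging are illustrative assumptions, not the paper's notation or weighting scheme.

```python
import random

def grad(w, x, y):
    # Gradient of the squared error 0.5 * (w*x - y)**2 for 1-D least squares.
    return (w * x - y) * x

def mll_sgd(data, n_hubs=2, workers_per_hub=3, rounds=60,
            max_local_steps=4, lr=0.1, seed=0):
    """Toy multi-level local SGD on a 1-D least-squares problem.

    Each round: every worker starts from the current global model and runs
    a heterogeneous number of local SGD steps (modeling slow and fast
    workers), each hub averages its workers' models, and the top level
    averages the hub models.
    """
    rng = random.Random(seed)
    w_global = 0.0
    for _ in range(rounds):
        hub_models = []
        for _hub in range(n_hubs):
            worker_models = []
            for _worker in range(workers_per_hub):
                w = w_global
                # Heterogeneous worker rate: a random number of local steps.
                for _ in range(rng.randint(1, max_local_steps)):
                    x, y = rng.choice(data)
                    w -= lr * grad(w, x, y)
                worker_models.append(w)
            # Sub-network (hub) level averaging.
            hub_models.append(sum(worker_models) / len(worker_models))
        # Top-level averaging across hubs.
        w_global = sum(hub_models) / len(hub_models)
    return w_global

# Synthetic data drawn from y = 2x; the learned weight should approach 2.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w = mll_sgd(data)
```

A faithful implementation would follow the paper's own averaging weights and communication schedule (e.g., the qτ = 32 local-step budget quoted above); this sketch only shows how the two levels of averaging nest around heterogeneous local updates.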