Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent

Authors: Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh

AAAI 2021, pp. 9037-9045 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'We conduct our experiments in Pytorch, testing convergence and speedup for residual networks (He et al. 2016) applied to image classification tasks on the CIFAR-10/100 datasets (Krizhevsky and Hinton 2009). Hyperparameter values are standard, and are given in the full version (Nadiradze et al. 2020). Experiments are performed on two AWS EC2 P3.2xlarge instances, each with a V100 GPU, and averaged over 3 trials.' (An illustrative training sketch for this kind of setup is given after the table.)
Researcher Affiliation | Academia | Giorgi Nadiradze (1), Ilia Markov (1), Bapi Chatterjee (1), Vyacheslav Kungurtsev (2), Dan Alistarh (1); affiliations: (1) IST Austria, (2) Czech Technical University in Prague
Pseudocode | No | The paper refers the reader to (Nadiradze et al. 2020) for a full description, including pseudocode; the provided extract itself contains no pseudocode or algorithm blocks.
Open Source Code | No | The paper states 'We implement the elastic scheduler on top of the Horovod distributed training framework (Sergeev and Del Balso 2018)' but does not link to its own source code; it refers to the full version of the paper (Nadiradze et al. 2020) for further details, and no explicit code-release statement appears in this extract.
Open Datasets | Yes | 'We conduct our experiments in Pytorch, testing convergence and speedup for residual networks (He et al. 2016) applied to image classification tasks on the CIFAR-10/100 datasets (Krizhevsky and Hinton 2009).'
Dataset Splits | No | The paper mentions validation accuracy and plots it in Figure 1, but does not specify the training/validation/test splits or their percentages in this text.
Hardware Specification | Yes | 'Experiments are performed on two AWS EC2 P3.2xlarge instances, each with a V100 GPU, and averaged over 3 trials.'
Software Dependencies | No | The paper mentions using 'Pytorch', 'Tensorflow', and the 'Horovod distributed training framework', but does not provide version numbers for these software dependencies.
Experiment Setup | No | The paper states that 'Hyperparameter values are standard, and are given in the full version (Nadiradze et al. 2020)', so comprehensive hyperparameter details are not provided in this extract.
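
For reference, below is a minimal sketch of a standard Horovod data-parallel PyTorch training loop for CIFAR-10, roughly matching the setup quoted above (two workers, one V100 GPU each). This is an illustration only, not the authors' code: it does not implement the paper's elastic scheduler, the ResNet-18 model is a stand-in for the residual networks used in the paper, and the learning rate, batch size, and epoch count are placeholders, since the paper defers its hyperparameters to the full version (Nadiradze et al. 2020).

# Minimal sketch: standard Horovod data-parallel training of a ResNet on CIFAR-10.
# Illustrative only: this is NOT the paper's elastic scheduler, and the model,
# learning rate, batch size, and epoch count below are placeholders.
import torch
import torchvision
import torchvision.transforms as T
import horovod.torch as hvd

hvd.init()
torch.cuda.set_device(hvd.local_rank())  # pin each worker to its local GPU

transform = T.Compose([T.RandomCrop(32, padding=4),
                       T.RandomHorizontalFlip(),
                       T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
# Shard the training data so each worker sees a distinct partition per epoch.
sampler = torch.utils.data.distributed.DistributedSampler(
    train_set, num_replicas=hvd.size(), rank=hvd.rank())
loader = torch.utils.data.DataLoader(train_set, batch_size=128, sampler=sampler)

model = torchvision.models.resnet18(num_classes=10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1 * hvd.size(),
                            momentum=0.9, weight_decay=5e-4)
# Wrap the optimizer so gradients are averaged across workers via allreduce.
optimizer = hvd.DistributedOptimizer(optimizer,
                                     named_parameters=model.named_parameters())
# Start all workers from identical model and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

criterion = torch.nn.CrossEntropyLoss()
for epoch in range(90):           # placeholder epoch count
    sampler.set_epoch(epoch)      # reshuffle the per-worker shards each epoch
    for images, targets in loader:
        images, targets = images.cuda(), targets.cuda()
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()

A script like this would typically be launched with one process per GPU, e.g. 'horovodrun -np 2 python train.py' across the two instances.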