Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent
Authors: Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh
AAAI 2021, pp. 9037-9045
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct our experiments in Pytorch, testing convergence and speedup for residual networks (He et al. 2016) applied to image classification tasks on the CIFAR-10/100 datasets (Krizhevsky and Hinton 2009). Hyperparameter values are standard, and are given in the full version (Nadiradze et al. 2020). Experiments are performed on two AWS EC2 P3.2xlarge instances, each with a V100 GPU, and averaged over 3 trials. (A hedged sketch of this setup appears after the table.) |
| Researcher Affiliation | Academia | Giorgi Nadiradze¹, Ilia Markov¹, Bapi Chatterjee¹, Vyacheslav Kungurtsev², Dan Alistarh¹ (¹ IST Austria, ² Czech Technical University in Prague) |
| Pseudocode | No | The paper states 'We refer the reader to (Nadiradze et al. 2020) for a full description, including pseudocode', but the provided extract itself contains no pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We implement the elastic scheduler on top of the Horovod distributed training framework (Sergeev and Del Balso 2018)', but provides no link to its own source code. It refers readers to the full version (Nadiradze et al. 2020) for details; no explicit code-release statement appears in this extract. (A sketch of the stock Horovod integration the authors build on appears after the table.) |
| Open Datasets | Yes | We conduct our experiments in Pytorch, testing convergence and speedup for residual networks (He et al. 2016) applied to image classification tasks on the CIFAR-10/100 datasets (Krizhevsky and Hinton 2009). |
| Dataset Splits | No | The paper reports validation accuracy (shown in Figure 1) but does not specify the training/validation/test splits or their percentages in this extract. |
| Hardware Specification | Yes | Experiments are performed on two AWS EC2 P3.2xlarge instances, each with a V100 GPU, and averaged over 3 trials. |
| Software Dependencies | No | The paper mentions 'Pytorch', 'Tensorflow', and the 'Horovod distributed training framework', but does not give version numbers for any of these dependencies. |
| Experiment Setup | No | The paper states that 'Hyperparameter values are standard, and are given in the full version (Nadiradze et al. 2020)', so comprehensive hyperparameter details are not available in this extract. |
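
Since the extract names the framework (PyTorch), the models (residual networks of He et al. 2016), and the datasets (CIFAR-10/100) but defers hyperparameters to the full version, the following is a minimal sketch of a comparable single-GPU baseline. Every value below (batch size, learning rate, momentum, weight decay, augmentation, epoch count) is a common default and an assumption, not the authors' configuration; torchvision's ImageNet-style `resnet18` stands in for the CIFAR-style ResNets used in the paper.

```python
# Hedged sketch: ResNet on CIFAR-10 in PyTorch, approximating the single-node
# setup the extract describes. All hyperparameters are assumed defaults, not
# values taken from the paper (which defers them to Nadiradze et al. 2020).
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

transform = T.Compose([
    T.RandomCrop(32, padding=4),        # standard CIFAR augmentation (assumed)
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True,
                          num_workers=2)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.resnet18(num_classes=10).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)

model.train()
for epoch in range(2):  # the paper's runs train for far more epochs
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
```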
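
The paper implements its elastic scheduler on top of Horovod (Sergeev and Del Balso 2018), but the scheduler itself is not included in this extract and no code release is linked. The sketch below shows only the stock Horovod/PyTorch integration such an implementation would build on; it is not the authors' elastic scheduler.

```python
# Hedged sketch: stock Horovod data-parallel training setup in PyTorch.
# The authors' elastic scheduler (not released in this extract) layers on
# top of this baseline. Launch one process per GPU, e.g.:
#   horovodrun -np 2 python train_hvd.py
import torch
import torchvision
import horovod.torch as hvd

hvd.init()
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank())
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torchvision.models.resnet18(num_classes=10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)

# Average gradients across all workers on each optimizer step.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# Start every worker from identical model and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)
```

In a full run each worker would also shard the dataset, e.g. with `torch.utils.data.distributed.DistributedSampler(train_set, num_replicas=hvd.size(), rank=hvd.rank())`, so that the two V100 instances described in the extract process disjoint minibatches.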