ATOMO: Communication-efficient Learning via Atomic Sparsification

Authors: Hongyi Wang, Scott Sievert, Shengchao Liu, Zachary Charles, Dimitris Papailiopoulos, Stephen Wright

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an empirical study of Spectral-ATOMO and compare it to the recently proposed QSGD [14] and TernGrad [16], on different neural network models and data sets, under real distributed environments.
Researcher Affiliation | Academia | Department of Computer Sciences and Department of Electrical and Computer Engineering, University of Wisconsin-Madison
Pseudocode | Yes | Algorithm 1: ATOMO probabilities (see the probability-computation sketch after the table)
Open Source Code | Yes | Code available at: https://github.com/hwang595/ATOMO
Open Datasets | Yes | We conducted our experiments on various models, datasets, and learning tasks, as detailed in Table 2. Datasets: CIFAR-10, CIFAR-100, SVHN
Dataset Splits | No | The paper mentions training data and mini-batch SGD but does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification | Yes | Our entire experimental pipeline is implemented in PyTorch [50] with mpi4py [49], and deployed on g2.2xlarge, m5.2xlarge, and m5.4xlarge instances in Amazon AWS EC2. We conducted our experiments on various models, datasets, and learning tasks, as detailed in Table 2.
Software Dependencies | No | Our entire experimental pipeline is implemented in PyTorch [50] with mpi4py [49]. The paper names the software used but does not provide specific version numbers for PyTorch or mpi4py. (A minimal mpi4py communication sketch follows the table.)
Experiment Setup | Yes | In our experiments, we use data augmentation (random crops and flips), and tuned the step-size for every different setup as shown in Table 5 in Appendix D. Momentum and regularization terms are switched off to make the hyperparameter search tractable and the results more legible. ... We ran ResNet-34 on CIFAR-10 using mini-batch SGD with batch size 512 split among compute nodes. (A PyTorch configuration sketch follows the table.)
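
The "Pseudocode" row refers to Algorithm 1 ("ATOMO probabilities") in the paper. The sketch below is our reading of that procedure, not the authors' released code: given the coefficients of an atomic decomposition and a sparsity budget s, it repeatedly sets p_i = min(1, budget * |lambda_i| / remaining mass) over the entries not yet clipped at 1, and the resulting probabilities drive an unbiased keep-and-rescale sparsifier. Function names and the NumPy implementation are ours.

```python
import numpy as np

def atomo_probabilities(lam, s):
    """Sketch of ATOMO's Algorithm 1 (our reading of the paper, not the
    released code): iterate p_i = min(1, budget * |lam_i| / remaining mass)
    until no new probability is clipped at 1."""
    mag = np.abs(np.asarray(lam, dtype=float))
    p = np.ones_like(mag)
    fixed = np.zeros(mag.shape, dtype=bool)  # entries whose p_i is pinned at 1
    while True:
        budget = s - fixed.sum()
        mass = mag[~fixed].sum()
        if budget <= 0 or mass == 0.0:
            p[~fixed] = 0.0
            break
        p[~fixed] = np.minimum(1.0, budget * mag[~fixed] / mass)
        newly_fixed = (~fixed) & (p >= 1.0)
        if not newly_fixed.any():
            break
        fixed |= newly_fixed
    return p

def sparsify(lam, p, rng=None):
    """Unbiased sparsification: keep atom i with probability p_i, rescale by 1/p_i."""
    rng = rng or np.random.default_rng()
    lam = np.asarray(lam, dtype=float)
    keep = rng.random(lam.shape) < p
    out = np.zeros_like(lam)
    out[keep] = lam[keep] / p[keep]
    return out
```

With probabilities of this form the sparsified vector is unbiased in expectation, and the expected number of retained atoms is controlled by the budget s.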
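The "Hardware Specification" and "Software Dependencies" rows quote a PyTorch + mpi4py pipeline on EC2. Purely for illustration, and not the authors' communication code, the following shows a common mpi4py pattern for averaging worker gradients (plain NumPy arrays with a placeholder size); ATOMO would transmit the sparsified representation instead of the dense vector.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Placeholder: each worker's (already sparsified and re-densified) gradient.
local_grad = np.random.randn(1024)

# Gather all workers' gradients at rank 0, average, then broadcast the result.
gathered = comm.gather(local_grad, root=0)
averaged = np.mean(gathered, axis=0) if rank == 0 else None
averaged = comm.bcast(averaged, root=0)
```

A script like this would typically be launched with something along the lines of `mpirun -n 4 python script.py` (command shown only as an example).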
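The "Experiment Setup" row describes random crop/flip augmentation, a per-setup tuned step size, momentum and regularization switched off, and ResNet-34 on CIFAR-10 with a total batch size of 512 split across workers. Below is a minimal single-node PyTorch sketch of that configuration; the learning rate is a placeholder (the paper tunes it per setup, Table 5 in Appendix D), and the stock torchvision ResNet-34 stands in for whatever CIFAR-specific variant the authors used.

```python
import torch
import torchvision
import torchvision.transforms as T

# Random crops and flips, as quoted in the Experiment Setup row.
transform = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
# Total batch size 512; in the distributed runs this is split among compute nodes.
loader = torch.utils.data.DataLoader(train_set, batch_size=512, shuffle=True)

model = torchvision.models.resnet34(num_classes=10)
# Plain mini-batch SGD: momentum and weight decay (regularization) switched off.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,  # lr is a placeholder
                            momentum=0.0, weight_decay=0.0)
criterion = torch.nn.CrossEntropyLoss()

for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    # In ATOMO, gradients would be sparsified and exchanged between workers
    # before the update; this sketch simply applies them locally.
    optimizer.step()
```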