Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification and Local Computations
Authors: Debraj Basu, Deepesh Data, Can Karakus, Suhas Diggavi
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use Qsparse-local-SGD to train ResNet-50 on ImageNet, and show that it results in significant savings over the state-of-the-art, in the number of bits transmitted to reach target accuracy. |
| Researcher Affiliation | Collaboration | Debraj Basu (Adobe Inc., dbasu@adobe.com); Deepesh Data (UCLA, deepeshdata@ucla.edu); Can Karakus (Amazon Inc., cakarak@amazon.com); Suhas Diggavi (UCLA, suhasdiggavi@ucla.edu) |
| Pseudocode | Yes | Algorithm 1 Qsparse-local-SGD (a minimal sketch of one round appears after this table) |
| Open Source Code | Yes | Our implementation is available at https://github.com/karakusc/horovod/tree/qsparselocal. |
| Open Datasets | Yes | We implement Qsparse-local-SGD for ResNet-50 using the ImageNet dataset, and show that we achieve target accuracies... We also perform analogous experiments on the MNIST [19] handwritten digits dataset for softmax regression with a standard ℓ2 regularizer... |
| Dataset Splits | No | The paper mentions using ImageNet and MNIST datasets and discusses training and testing, but does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages or sample counts for each split). |
| Hardware Specification | Yes | We train ResNet-50 [13] (which has d = 25,610,216 parameters) on the ImageNet dataset, using 8 NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using the 'Horovod framework [28]' but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We use a learning rate schedule consisting of 5 epochs of linear warmup, followed by a piecewise decay of 0.1 at epochs 30, 60 and 80, with a batch size of 256 per GPU. For experiments, we focus on SGD with momentum of 0.9, applied on the local iterations of the workers. (A sketch of this schedule also follows the table.) |
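To give a concrete picture of the "Algorithm 1 Qsparse-local-SGD" pseudocode flagged above, here is a minimal NumPy sketch of one worker's synchronization round: H local SGD steps followed by an error-compensated, quantized top-k update of the model difference. The function names (`topk`, `quantize_sign`, `local_round`), the scaled-sign quantizer, and all hyperparameter values are illustrative assumptions, not the paper's exact operators or settings.

```python
import numpy as np

def topk(v, k):
    """Top-k sparsification: keep the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def quantize_sign(v):
    """Scaled sign quantization over the nonzero support (one illustrative
    choice of quantizer; the paper's framework admits other quantizers)."""
    nz = v != 0
    if not nz.any():
        return v.copy()
    q = np.zeros_like(v)
    q[nz] = np.abs(v[nz]).mean() * np.sign(v[nz])
    return q

def local_round(x_sync, grad_fn, lr, H, k, memory):
    """One worker's round: H local SGD steps, then an error-compensated,
    quantized top-k compression of the accumulated model difference."""
    x = x_sync.copy()
    for _ in range(H):
        x -= lr * grad_fn(x)
    residual = memory + (x - x_sync)       # add back past compression error
    update = quantize_sign(topk(residual, k))
    memory = residual - update             # error feedback for the next round
    return update, memory

# Toy usage: one worker minimizing ||x||^2 / 2 with noisy gradients.
rng = np.random.default_rng(0)
d, k, H, lr = 10, 3, 4, 0.1
x_sync, memory = np.ones(d), np.zeros(d)
grad_fn = lambda x: x + 0.01 * rng.standard_normal(d)
for _ in range(20):
    update, memory = local_round(x_sync, grad_fn, lr, H, k, memory)
    x_sync = x_sync + update  # with W workers, average the W updates instead
```

The error-feedback memory is what lets the compressed updates remain unbiased in aggregate: whatever the quantized top-k operator discards in one round is carried into the next.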
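The experiment-setup row quotes a warmup-plus-piecewise-decay schedule. A minimal sketch of that schedule follows, assuming a base learning rate of 0.1 (the quote does not state one) and the hypothetical function name `lr_schedule`.

```python
def lr_schedule(epoch, base_lr=0.1, warmup_epochs=5,
                milestones=(30, 60, 80), decay=0.1):
    """Per-epoch learning rate: linear warmup for the first 5 epochs,
    then a 0.1x piecewise decay at epochs 30, 60 and 80. The base rate
    of 0.1 is an assumed placeholder, not a value from the paper."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    return base_lr * decay ** sum(epoch >= m for m in milestones)
```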