Moniqua: Modulo Quantized Communication in Decentralized SGD

Authors: Yucheng Lu, Christopher De Sa

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate empirically that Moniqua converges faster with respect to wall-clock time than other quantized decentralized algorithms. We also show that Moniqua is robust to very low bit-budgets, allowing 1-bit-per-parameter communication without compromising validation accuracy when training ResNet20 and ResNet110 on CIFAR10. [Section 6, Experiments] In this section, we evaluate Moniqua empirically. First, we compare the convergence of Moniqua against other quantized decentralized training algorithms under different network configurations. Second, we compare their validation performance under an extreme bit-budget. Then we investigate Moniqua's scalability on D2 and AD-PSGD. Finally, we introduce several useful techniques for running Moniqua efficiently.
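To make the 1-bit-per-parameter claim concrete, the sketch below illustrates the modulo-quantization idea at the heart of Moniqua: because neighboring workers' models stay close, a worker only needs to transmit a low-bit encoding of its parameters modulo a small bound theta, and the receiver can recover the full value from its own local estimate. This is a minimal NumPy sketch under assumed names (`modulo_quantize`, `modulo_dequantize`) with deterministic rounding standing in for the paper's quantizer, not the paper's exact scheme.

```python
import numpy as np

def modulo_quantize(x, theta, bits=1):
    """Quantize x modulo theta on a uniform grid of 2**bits points.

    Only the residue x mod theta is encoded, so the bit cost is
    independent of the magnitude of x (sketch; deterministic rounding
    replaces the paper's quantizer).
    """
    levels = 2 ** bits
    step = theta / levels
    residue = np.mod(x, theta)            # wrap into [0, theta)
    q = np.round(residue / step) * step   # snap to the quantization grid
    return np.mod(q, theta)               # fold the endpoint theta back to 0

def modulo_dequantize(q, estimate, theta):
    """Recover x from its quantized residue using a local estimate.

    Valid when |x - estimate| stays below roughly theta/2, i.e. the
    workers' models are close enough that the residue identifies x
    uniquely near the estimate.
    """
    diff = np.mod(q - estimate + theta / 2.0, theta) - theta / 2.0
    return estimate + diff
```

For example, with theta = 1 and 1 bit, x = 5.3 is sent as the residue 0.5 and recovered from a local estimate of 5.25 as 5.5, within half a grid step of the true value.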
Researcher Affiliation | Academia | Yucheng Lu, Christopher De Sa; Department of Computer Science, Cornell University, Ithaca, New York, United States.
Pseudocode | Yes | Algorithm 1: Pseudo-code of Moniqua on worker i.
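Reading this row together with the quantizer sketch above, one iteration of a Moniqua-style worker combines quantized gossip averaging with a local SGD step. The following is an illustrative reconstruction, not the paper's exact Algorithm 1: the argument names, the estimate bookkeeping, and the update order are assumptions.

```python
def moniqua_step(i, x_i, models_hat, residues, W, grad_i, lr, theta):
    """Illustrative single iteration on worker i (sketch, not the
    paper's exact Algorithm 1).

    x_i: worker i's own parameters; models_hat[j]: i's local estimate
    of worker j's parameters; residues[j]: low-bit theta-modulo residue
    received from neighbor j; W: doubly-stochastic mixing matrix;
    grad_i: local stochastic gradient; lr: step size.
    """
    n = len(models_hat)
    # Recover each neighbor's parameters from its quantized residue,
    # reusing modulo_dequantize from the sketch above; worker i knows
    # its own parameters exactly.
    recovered = [x_i if j == i else
                 modulo_dequantize(residues[j], models_hat[j], theta)
                 for j in range(n)]
    # Gossip-average with the mixing weights, then take a local SGD step.
    mixed = sum(W[i][j] * recovered[j] for j in range(n))
    return mixed - lr * grad_i
```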
Open Source Code | No | The paper does not explicitly state that source code for Moniqua is released, provide a repository link, or mention its inclusion in supplementary materials.
Open Datasets | Yes | We launch 8 workers connected in a ring topology and train a ResNet20 (He et al., 2016) model on CIFAR10 (Krizhevsky et al., 2014).
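For context on the ring topology in this excerpt, a common doubly-stochastic mixing matrix for an n-worker ring gives each worker weight 1/3 on itself and on each of its two neighbors. The paper's exact mixing weights are not quoted here, so the construction below is an illustrative assumption.

```python
import numpy as np

def ring_mixing_matrix(n=8):
    """Doubly-stochastic mixing matrix for a ring of n workers.

    Illustrative choice matching the 8-worker ring in the experiments:
    uniform weight 1/3 on self and both ring neighbors (the paper's
    actual weights are an assumption here). Every row and column sums
    to 1, as gossip averaging requires.
    """
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W
```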
Dataset Splits | No | The paper mentions 'validation accuracy' and 'final test accuracy' but does not provide specific details on how the dataset was split into training, validation, and test sets (e.g., percentages or sample counts).
Hardware Specification | Yes | We launch one instance as one worker in the previous formulation, each configured with a 2-core CPU with 4 GB memory and an NVIDIA Tesla P100 GPU.
Software Dependencies | No | All the models and training scripts in this section are implemented in PyTorch and run on Google Cloud Platform. We use MPICH as the communication backend. All the instances are running Ubuntu 16.04... The paper mentions software but does not specify version numbers for PyTorch or MPICH.
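Since the stack is PyTorch over MPICH, the communication layer was presumably initialized along the following lines. This is a hedged sketch, not the authors' script: it requires a PyTorch build compiled with MPI support, and the launch command is an assumption.

```python
import torch.distributed as dist

# Sketch: initialize PyTorch's MPI backend (needs a PyTorch build with
# MPI support); typically launched as `mpirun -np 8 python train.py`.
dist.init_process_group(backend="mpi")
rank = dist.get_rank()                  # this worker's position on the ring
world_size = dist.get_world_size()
left, right = (rank - 1) % world_size, (rank + 1) % world_size

# Quantized residues would then be exchanged with the ring neighbors
# via point-to-point calls such as dist.isend / dist.recv.
```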
Experiment Setup | Yes | In the experiment, we adopt the following hyperparameters for Moniqua: {Momentum = 0.9, Weight Decay = 5e-4, Batch Size = 128, Initial Step Size = 0.1, θk = 2.0}. In the extreme bit-budget experiment, we further adopt the averaging ratio {γ = 5e-3}.
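As a sanity check on these values, a minimal PyTorch setup with the reported hyperparameters might look like the sketch below. The model is a stand-in (the paper trains ResNet-20/110 on CIFAR-10, and torchvision ships no ResNet-20), and θk and γ are Moniqua-specific knobs consumed by the communication/averaging step rather than by the optimizer.

```python
import torch
import torch.nn as nn

# Stand-in module; the paper trains ResNet-20 and ResNet-110 on CIFAR-10.
model = nn.Linear(32 * 32 * 3, 10)

# Reported hyperparameters wired into a standard PyTorch SGD optimizer.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,            # initial step size
    momentum=0.9,
    weight_decay=5e-4,
)
batch_size = 128
theta_k = 2.0          # Moniqua modulo bound θk (used in communication)
gamma = 5e-3           # averaging ratio γ for the extreme bit-budget runs
```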