Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients

Authors: Jun Sun, Tianyi Chen, Georgios B. Giannakis, Zaiyue Yang

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, experiments with real data corroborate a significant communication reduction compared to existing gradient- and stochastic gradient-based algorithms." and "Numerical tests and conclusions"
Researcher Affiliation | Academia | Jun Sun (Zhejiang University, Hangzhou, China 310027, sunjun16sj@gmail.com); Tianyi Chen (Rensselaer Polytechnic Institute, Troy, New York 12180, chent18@rpi.edu); Georgios B. Giannakis (University of Minnesota, Twin Cities, Minneapolis, MN 55455, georgios@umn.edu); Zaiyue Yang (Southern University of Science and Technology, Shenzhen, China 518055, yangzy3@sustc.edu.cn)
Pseudocode | Yes | Algorithm 1 (QGD) and Algorithm 2 (LAQ); a hedged sketch of the quantization and lazy-upload steps appears after this table.
Open Source Code | No | The paper does not contain an explicit statement or link indicating the release of source code for the described methodology.
Open Datasets | Yes | "The dataset we use is MNIST [15], which are uniformly distributed across M = 10 workers." and [15] Yann LeCun, Corinna Cortes, and C. J. Burges. MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2:18, 2010. A sketch of such a uniform M = 10 partition appears after this table.
Dataset Splits | No | The paper mentions MNIST and other datasets but does not give specific train/validation/test splits (e.g., percentages or sample counts) in the main text; it defers the details to supplementary materials that are not provided here.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions) needed to replicate the experiments.
Experiment Setup | Yes | "In the experiments, we set D = 10, ξ1 = ξ2 = ⋯ = ξD = 0.8/D, t = 100; see the detailed setup in the supplementary materials." To benchmark LAQ, the paper compares it with two classes of algorithms, gradient-based and minibatch stochastic gradient-based, in two corresponding tests. The number of bits per coordinate is set as b = 3 for logistic regression and b = 8 for the neural network, respectively. The stepsize is set as α = 0.02 for both algorithms.
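
To make the pseudocode and setup rows above more concrete, the following is a minimal Python sketch of a b-bit per-coordinate quantizer applied to the gradient innovation, together with a lazy-upload test in the spirit of LAQ using the reported choices D = 10, ξd = 0.8/D, α = 0.02, b = 3. The function names, the NumPy implementation, and the exact form of the skip condition are illustrative assumptions, not the authors' code; the precise trigger condition should be taken from the paper itself.

```python
import numpy as np

def quantize_innovation(grad, prev_q, bits):
    """Quantize the gradient innovation (grad - prev_q) onto a uniform grid with
    2**bits levels per coordinate; return the new quantized gradient and the
    quantization radius (used in the error terms)."""
    innovation = grad - prev_q
    radius = float(np.max(np.abs(innovation)))   # infinity-norm radius of the innovation
    if radius == 0.0:
        return prev_q.copy(), 0.0
    step = 2.0 * radius / (2 ** bits - 1)        # grid spacing over [-radius, radius]
    quantized = np.round((innovation + radius) / step) * step - radius
    return prev_q + quantized, radius

def should_upload(new_q, prev_q, model_diffs, xi, alpha, num_workers, err_terms):
    """Lazy-aggregation trigger in the spirit of LAQ (hedged reconstruction):
    upload only if the change in the quantized gradient is large relative to a
    xi-weighted sum of the last D model updates plus quantization-error terms."""
    lhs = float(np.sum((new_q - prev_q) ** 2))
    rhs = sum(w * float(np.sum(d ** 2)) for w, d in zip(xi, model_diffs))
    rhs = rhs / (alpha ** 2 * num_workers ** 2) + 3.0 * sum(err_terms)
    return lhs >= rhs

# Example with the paper's reported choices (logistic-regression test).
D, M, alpha, bits = 10, 10, 0.02, 3
xi = [0.8 / D] * D
```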
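The Open Datasets row quotes the paper's statement that MNIST is uniformly distributed across M = 10 workers. Below is a minimal sketch of such a uniform partition, assuming torchvision's MNIST loader; the paper does not name its libraries, preprocessing, or any train/validation split, so everything beyond M = 10 and the uniform split is a hypothetical choice.

```python
import numpy as np
from torchvision import datasets, transforms   # assumed dependency; the paper names no libraries

M = 10  # number of workers, as stated in the paper

# Standard MNIST training set (60,000 images); the transform is a placeholder.
mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())

# Uniformly distribute the sample indices across the M workers.
rng = np.random.default_rng(0)
shards = np.array_split(rng.permutation(len(mnist)), M)
worker_indices = {m: shards[m] for m in range(M)}
```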