Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients
Authors: Jun Sun, Tianyi Chen, Georgios Giannakis, Zaiyue Yang
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Empirically, experiments with real data corroborate a significant communication reduction compared to existing gradient- and stochastic gradient-based algorithms." and "Numerical tests and conclusions" |
| Researcher Affiliation | Academia | Jun Sun, Zhejiang University, Hangzhou, China 310027, sunjun16sj@gmail.com; Tianyi Chen, Rensselaer Polytechnic Institute, Troy, New York 12180, chent18@rpi.edu; Georgios B. Giannakis, University of Minnesota, Twin Cities, Minneapolis, MN 55455, georgios@umn.edu; Zaiyue Yang, Southern U. of Science and Technology, Shenzhen, China 518055, yangzy3@sustc.edu.cn |
| Pseudocode | Yes | Algorithm 1 (QGD) and Algorithm 2 (LAQ); see the sketch after the table. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating the release of source code for the described methodology. |
| Open Datasets | Yes | "The dataset we use is MNIST [15], which is uniformly distributed across M = 10 workers." and [15] Yann LeCun, Corinna Cortes, and C. J. Burges. MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2:18, 2010. |
| Dataset Splits | No | The paper mentions using MNIST and other datasets but does not provide specific train/validation/test splits (e.g., percentages or sample counts) in the main text. It defers further details to the supplementary materials, which are not included here. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions) needed to replicate the experiment. |
| Experiment Setup | Yes | In the experiments, we set D = 10, ξ1 = ξ2 = ⋯ = ξD = 0.8/D, t = 100; see the detailed setup in the supplementary materials. To benchmark LAQ, we compare it with two classes of algorithms, gradient-based algorithms and minibatch stochastic gradient-based algorithms, corresponding to the following two tests. The number of bits per coordinate is set as b = 3 for logistic regression and b = 8 for the neural network, respectively. The stepsize is set as α = 0.02 for both algorithms. (These values are wired into the sketch after the table.) |
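
The Pseudocode and Experiment Setup rows reference Algorithm 1 (QGD), Algorithm 2 (LAQ), and a handful of hyperparameters (D = 10, ξd = 0.8/D, b bits per coordinate, stepsize α = 0.02). Below is a minimal Python sketch of the mechanism those rows describe: each worker quantizes its gradient to b bits per coordinate relative to its last uploaded value, then applies a lazy trigger that skips the upload when the quantized innovation is small compared to recent parameter movement. The function names, the exact threshold constants, and the toy driver are illustrative assumptions, not the paper's precise Algorithm 2; consult the paper and its supplementary materials for the authoritative rules.

```python
import numpy as np

def quantize(grad, ref, bits):
    """Uniform b-bit quantization of `grad`, coded relative to the previously
    uploaded quantized gradient `ref` (differential quantization)."""
    radius = float(np.max(np.abs(grad - ref)))   # range of the innovation
    if radius == 0.0:
        return ref.copy(), 0.0
    step = 2.0 * radius / (2 ** bits - 1)        # width of one quantization bin
    levels = np.round((grad - ref + radius) / step)
    q = ref - radius + levels * step             # reconstructed quantized gradient
    err = float(np.sum((q - grad) ** 2))         # squared quantization error
    return q, err

def should_upload(innovation_sq, param_diffs, xi, alpha, M, err_now, err_prev):
    """Illustrative LAQ-style trigger: upload only when the change in the
    quantized gradient outweighs a weighted sum of recent parameter changes
    plus the quantization errors (the constants here are an assumption)."""
    threshold = sum(x * d for x, d in zip(xi, param_diffs)) / (alpha ** 2 * M ** 2)
    threshold += 3.0 * (err_now + err_prev)
    return innovation_sq > threshold

# Toy driver wiring in the values reported in the table above.
M, D = 10, 10                     # workers; number of past parameter differences kept
xi = [0.8 / D] * D                # ξ1 = ... = ξD = 0.8 / D
alpha, bits = 0.02, 3             # stepsize; bits per coordinate (logistic regression test)

rng = np.random.default_rng(0)
prev_q = np.zeros(5)              # this worker's last uploaded quantized gradient
param_diffs = [0.0] * D           # ||θ^{k+1-d} - θ^{k-d}||^2 for d = 1, ..., D
grad = rng.standard_normal(5)     # stand-in for the worker's local gradient

q, err = quantize(grad, prev_q, bits)
innovation_sq = float(np.sum((q - prev_q) ** 2))
print("upload this round:", should_upload(innovation_sq, param_diffs, xi, alpha, M, err, 0.0))
```

In the full algorithm, the server aggregates the uploaded quantized innovations and takes a gradient step with stepsize α, while workers that skip an upload implicitly reuse their last quantized gradient; that skipping plus the b-bit encoding is the source of the communication reduction quoted in the Research Type row.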