Error Compensated Distributed SGD Can Be Accelerated

Authors: Xun Qian, Peter Richtárik, Tong Zhang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we experimentally study the performance of error compensated L-Katyusha (ECLK) used with several contraction compressors on the logistic regression problem for binary classification, x ↦ log(1 + exp(−y_i A_i^T x)) + λ... We use the datasets a5a, a9a, w6a, w8a, phishing, and mushrooms from the LIBSVM library [Chang and Lin, 2011]."
Researcher Affiliation | Academia | Xun Qian (xun.qian@kaust.edu.sa), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; Peter Richtárik (peter.richtarik@kaust.edu.sa), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, and Moscow Institute of Physics and Technology, Dolgoprudny, Russia; Tong Zhang (tongzhang@ust.hk), Hong Kong University of Science and Technology, Hong Kong
Pseudocode | Yes | Algorithm 1: Error Compensated Loopless Katyusha (ECLK)
Open Source Code | Yes | "Code and instructions are in the supplemental material."
Open Datasets | Yes | "We use the datasets a5a, a9a, w6a, w8a, phishing, and mushrooms from the LIBSVM library [Chang and Lin, 2011]."
Dataset Splits | No | The paper states: "In all the experiments, we search for the optimal stepsize for all tested algorithms." and "We calculate the theoretical L_f, L, and 𝓛 as L_f^th, L^th, and 𝓛^th, respectively. Then we choose L_f = t·L_f^th, L = t·L^th, and 𝓛 = t·𝓛^th, and search for the best t in the set {10^(−k) | k = 0, 1, 2, ...}." While this describes a hyperparameter search, it does not provide the training/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | No | The paper states: "We run the experiments on a laptop, and we did not count the time. Hence the results are independent of the amount of compute and the type of resources." This identifies only a general device class ("laptop") and lacks specific hardware details such as CPU model, GPU model, or memory.
Software Dependencies | Yes | "We use Python 3.7 to perform the experiments."
Experiment Setup | Yes | "The regularization parameter was set to λ = 10^(−3). The number of nodes in our experiments is n = 20. We use the parameter setting in Theorem 3.8 (i) for ECLK. We calculate the theoretical L_f, L, and 𝓛 as L_f^th, L^th, and 𝓛^th, respectively. Then we choose L_f = t·L_f^th, L = t·L^th, and 𝓛 = t·𝓛^th, and search for the best t in the set {10^(−k) | k = 0, 1, 2, ...}."
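The objective quoted in the Research Type row is a regularized logistic regression loss. A minimal sketch in Python follows; the regularizer after the λ is elided in the extracted excerpt, so a standard ℓ2 term (λ/2)·||x||² is assumed here purely for illustration:

```python
import numpy as np

def logistic_loss(x, A, y, lam=1e-3):
    """Regularized logistic loss for binary classification.

    A: (N, d) feature matrix; y: labels in {-1, +1}.
    NOTE: the term after lambda is elided in the quoted excerpt;
    an l2 regularizer (lam/2)*||x||^2 is assumed for illustration.
    """
    margins = -y * (A @ x)
    # log(1 + exp(m)) computed stably as logaddexp(0, m)
    return np.logaddexp(0.0, margins).mean() + 0.5 * lam * (x @ x)
```

At x = 0 every margin is zero, so the loss reduces to log 2, which is a quick sanity check for an implementation.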
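The Research Type row also mentions "contraction compressors". The paper tests several; as one standard example (not necessarily one the paper uses), a top-k sparsifier satisfies the contraction property ||C(x) − x||² ≤ (1 − k/d)·||x||² for x ∈ R^d:

```python
import numpy as np

def top_k(x, k):
    """Top-k sparsifier: a standard contraction compressor that
    keeps the k largest-magnitude entries of x and zeros the rest.
    Shown only as an illustrative example of a contraction compressor.
    """
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]  # indices of the k largest |x_i|
    out[idx] = x[idx]
    return out
```

For example, `top_k(np.array([3., -1., 2.]), 2)` keeps the entries 3 and 2 and zeros out −1.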
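The hyperparameter search quoted in the Experiment Setup row (scaling the theoretical smoothness constants by t and picking the best t on a logarithmic grid) can be sketched as follows; `run_eclk` is a hypothetical callback standing in for a full training run, and the grid {10^(−k)} reflects the quoted set:

```python
import numpy as np

def search_scaling(run_eclk, Lf_th, L_th, cL_th, num_scales=6):
    """Grid-search the scaling factor t for the smoothness constants:
    Lf = t*Lf_th, L = t*L_th, cL = t*cL_th, with t in {10**(-k)}.

    `run_eclk` is a hypothetical callback that runs the method with the
    given (Lf, L, cL) triple and returns the final suboptimality.
    """
    best_t, best_err = None, np.inf
    for k in range(num_scales):
        t = 10.0 ** (-k)
        err = run_eclk(t * Lf_th, t * L_th, t * cL_th)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err
```

The search is one-dimensional because a single factor t rescales all three constants at once, which is how the quoted setup describes it.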