Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction

Authors: Kenta Niwa, Guoqiang Zhang, W. Bastiaan Kleijn, Noboru Harada, Hiroshi Sawada, Akinori Fujino

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated the constructed algorithms by investigating the learning curves for the case that statistically heterogeneous data subsets are placed at the local nodes.
Researcher Affiliation | Collaboration | NTT Communication Science Laboratories, Kyoto, Japan; NTT Media Intelligence Laboratories, Tokyo, Japan; University of Technology Sydney, Sydney, Australia; Victoria University of Wellington, Wellington, New Zealand.
Pseudocode | Yes | Algorithm 1 (Previous ECL; Niwa et al., 2020) and Algorithm 2 (Proposed ECL-ISVR).
Open Source Code | Yes | A part of our source code is available: https://github.com/nttcslab/ecl-isvr
Open Datasets | Yes | Fashion-MNIST (Xiao et al., 2017) consists of 28x28-pixel gray-scale images in 10 classes; the CIFAR-10 dataset consists of 32x32 color images in 10 object classes (Krizhevsky et al., 2009). A hedged torchvision loading sketch follows the table.
Dataset Splits | No | The paper mentions training and test data but does not explicitly specify a validation split or how it was used for model selection/tuning.
Hardware Specification | Yes | We constructed software that runs on a server that has 8 GPUs (NVIDIA GeForce RTX 2080Ti) with 2 CPUs (Intel Xeon Gold 5222, 3.80 GHz).
Software Dependencies | Yes | PyTorch (v1.6.0) with CUDA (v10.2) and Gloo for node communication was used. A minimal initialization sketch follows the table.
Experiment Setup | Yes | Squared L2 model regularization with weight 0.01 is added to the cost function. The step size µ = 0.002 and the mini-batch size 100 are used in all settings. The communication for each edge is conducted once per K = 8 local updates on average, with R = 8,800 rounds for (N1) and R = 5,600 rounds for (N2). A hedged training-schedule sketch follows the table.
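Both datasets named above are available through torchvision, so local replication of the data setup is straightforward. The sketch below is a minimal loading example; the root path, transform, and shuffling are illustrative assumptions, not details taken from the paper.

```python
import torch
import torchvision
from torchvision import transforms

to_tensor = transforms.ToTensor()

# Fashion-MNIST: 28x28 gray-scale images, 10 classes
fmnist = torchvision.datasets.FashionMNIST(
    root="./data", train=True, download=True, transform=to_tensor)

# CIFAR-10: 32x32 color images, 10 object classes
cifar10 = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=to_tensor)

# Mini-batch size 100, matching the reported experiment setup
loader = torch.utils.data.DataLoader(fmnist, batch_size=100, shuffle=True)
```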
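The reported software stack (PyTorch 1.6 with the Gloo backend) corresponds to the standard torch.distributed setup. Below is a minimal initialization sketch under that assumption; the rendezvous address and port are hypothetical placeholders, and the paper does not state how the eight worker processes were launched.

```python
import os
import torch.distributed as dist

def init_gloo(rank: int, world_size: int) -> None:
    # Hypothetical rendezvous address/port; not specified in the paper.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    # Gloo backend for tensor exchange between the worker processes (one per node).
    dist.init_process_group(backend="gloo", rank=rank, world_size=world_size)
```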
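The reported hyperparameters map onto a standard PyTorch training skeleton. The sketch below only illustrates the schedule (one communication per K = 8 local updates) together with the stated step size, mini-batch size, and squared-L2 weight; `exchange_with_neighbors` is a hypothetical stub and does not implement the ECL-ISVR primal-dual update itself, and using `weight_decay` for the squared-L2 penalty is an assumption about the implementation.

```python
import torch

def exchange_with_neighbors(model):
    # Placeholder: the actual ECL-ISVR edge-wise exchange is not reproduced here.
    pass

def train(model, loader, rounds, local_steps_per_round=8):
    # Step size 0.002; weight_decay 0.01 stands in for the squared-L2 penalty.
    opt = torch.optim.SGD(model.parameters(), lr=0.002, weight_decay=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()
    data = iter(loader)  # loader built with batch_size=100
    for _ in range(rounds):                      # R = 8,800 (N1) or 5,600 (N2)
        for _ in range(local_steps_per_round):   # K = 8 local updates per communication
            try:
                x, y = next(data)
            except StopIteration:
                data = iter(loader)
                x, y = next(data)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        exchange_with_neighbors(model)           # hypothetical communication step
```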