Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction
Authors: Kenta Niwa, Guoqiang Zhang, W. Bastiaan Kleijn, Noboru Harada, Hiroshi Sawada, Akinori Fujino
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated the constructed algorithms by investigating the learning curves for the case that statistically heterogeneous data subsets are placed at the local nodes. |
| Researcher Affiliation | Collaboration | 1NTT Communication Science Laboratories, Kyoto, Japan 2NTT Media Intelligence Laboratories, Tokyo, Japan 3University of Technology Sydney, Sydney, Australia 4Victoria University of Wellington, Wellington, New Zealand. |
| Pseudocode | Yes | Algorithm 1 Previous ECL (Niwa et al., 2020) and Algorithm 2 Proposed ECL-ISVR |
| Open Source Code | Yes | A part of our source code is available: https://github.com/nttcslab/ecl-isvr |
| Open Datasets | Yes | Fashion MNIST (Xiao et al., 2017) consists of 28×28-pixel gray-scale images in 10 classes. The CIFAR-10 data set consists of 32×32 color images in 10 object classes (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper mentions training and test data but does not explicitly specify a validation split or how it was used for model selection/tuning. |
| Hardware Specification | Yes | We constructed software that runs on a server that has 8 GPUs (NVIDIA GeForce RTX 2080Ti) with 2 CPUs (Intel Xeon Gold 5222, 3.80 GHz). |
| Software Dependencies | Yes | PyTorch (v1.6.0) with CUDA (v10.2) and Gloo for node communication was used. |
| Experiment Setup | Yes | Squared L2 model regularization with weight 0.01 is added to the cost function. The step-size µ = 0.002 and the mini-batch size 100 are used in all settings. The communication for each edge is conducted once per K = 8 local updates on average, with R = 8,800 rounds for (N1) and R = 5,600 rounds for (N2). |
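
The reported experiment setup can be collected into a minimal configuration sketch. This is not the authors' released code (see the repository linked above); the class and field names below are hypothetical, and only the numeric values are taken from the setup quoted in the table.

```python
from dataclasses import dataclass

@dataclass
class ECLISVRExperimentConfig:
    """Hypothetical container for the hyperparameters reported in the paper."""
    l2_weight: float = 0.01          # squared L2 regularization weight added to the cost
    step_size: float = 0.002         # step-size (mu), identical in all settings
    batch_size: int = 100            # mini-batch size
    local_updates_per_comm: int = 8  # K: average local updates between edge communications
    rounds_n1: int = 8800            # R: communication rounds for network topology (N1)
    rounds_n2: int = 5600            # R: communication rounds for network topology (N2)

if __name__ == "__main__":
    # Instantiate and inspect the reported settings.
    print(ECLISVRExperimentConfig())
```
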