Topology-aware Generalization of Decentralized SGD
Authors: Tongtian Zhu, Fengxiang He, Lan Zhang, Zhengyang Niu, Mingli Song, Dacheng Tao
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments of VGG-11 and ResNet-18 on CIFAR-10, CIFAR-100 and Tiny-ImageNet justify our theory. |
| Researcher Affiliation | Collaboration | (1) College of Computer Science and Technology, Zhejiang University; (2) Shanghai Institute for Advanced Study of Zhejiang University; (3) JD Explore Academy, JD.com Inc.; (4) School of Computer Science and Technology, University of Science and Technology of China; (5) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; (6) School of Computer Science, Wuhan University; (7) Zhejiang University City College. |
| Pseudocode | No | The paper describes an update rule in Equation (1) but does not provide a clearly labeled pseudocode or algorithm block (a hedged code sketch of the standard DSGD update appears after this table). |
| Open Source Code | Yes | Code is available at https://github.com/Raiden-Zhu/Generalization-of-DSGD. |
| Open Datasets | Yes | The models are trained on CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009) and Tiny ImageNet (Le & Yang, 2015), three popular benchmark image classification datasets. |
| Dataset Splits | Yes | The CIFAR-10 dataset consists of 60,000 32×32 color images across 10 classes, with each class containing 5,000 training and 1,000 testing images. The CIFAR-100 dataset also consists of 60,000 32×32 color images, except that it has 100 classes, each class containing 500 training and 100 testing images. Tiny ImageNet contains 120,000 64×64 color images in 200 classes, each class containing 500 training images, 50 validation images, and 50 test images. |
| Hardware Specification | Yes | All our experiments are conducted on a computing cluster with GPUs of NVIDIA Tesla V100 16GB and CPUs of Intel Xeon Gold 6140 CPU @ 2.30GHz. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' but does not specify its version number, nor other software dependencies with versions. |
| Experiment Setup | Yes | The local batch size is set as 64. The initial learning rate is set as 0.1 and will be divided by 10 when the model has accessed 2/5 and 4/5 of the total number of iterations (see the schedule sketch below the table). |
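
The update rule the paper gives in Equation (1) is the standard decentralized SGD step: each worker takes a local stochastic-gradient step and then averages parameters with its neighbors according to a doubly stochastic mixing matrix W determined by the communication topology. Below is a minimal NumPy sketch, assuming the common adapt-then-combine form x_i^(t+1) = Σ_j W_ij (x_j^(t) − η g_j(x_j^(t))); the function name, array layout, and ring topology here are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def dsgd_step(params, grads, W, lr):
    """One decentralized SGD step (adapt-then-combine form; illustrative).

    params: (n_workers, dim) array, row i holds worker i's parameters.
    grads:  (n_workers, dim) array, row i holds worker i's local stochastic gradient.
    W:      (n_workers, n_workers) doubly stochastic mixing matrix; W[i, j] > 0
            only if workers i and j are neighbors in the communication topology.
    lr:     learning rate.
    """
    # Each worker first takes a local gradient step ...
    local = params - lr * grads
    # ... then averages the result with its neighbors, weighted by W.
    return W @ local

# Illustrative ring topology over 4 workers: each worker mixes uniformly
# with itself and its two neighbors, so every row/column of W sums to 1.
n = 4
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        W[i, j % n] = 1.0 / 3.0
```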
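
The learning-rate schedule stated in the experiment setup (start at 0.1, divide by 10 after 2/5 of the iterations, and again after 4/5) can be written as a small step function. A minimal sketch, assuming `total_iters` counts all training iterations and `t` is the current iteration index (both names are hypothetical):

```python
def learning_rate(t, total_iters, base_lr=0.1):
    """Step schedule from the paper's setup: base_lr is divided by 10
    once 2/5, and again once 4/5, of the total iterations have passed."""
    if t < 2 * total_iters // 5:
        return base_lr          # first 40% of training: 0.1
    if t < 4 * total_iters // 5:
        return base_lr / 10     # next 40%: 0.01
    return base_lr / 100        # final 20%: 0.001
```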