Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence

Authors: Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada

NeurIPS 2023

Reproducibility

| Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conducted experiments with various topologies, demonstrating that the BASE-(k+1) GRAPH enables various decentralized learning methods to achieve higher accuracy with better communication efficiency than the existing topologies. |
| Researcher Affiliation | Collaboration | Yuki Takezawa (1,2), Ryoma Sato (1,2), Han Bao (1,2), Kenta Niwa (3), Makoto Yamada (2); 1: Kyoto University, 2: OIST, 3: NTT Communication Science Laboratories |
| Pseudocode | Yes | Algorithm 1: k-PEER HYPER-HYPERCUBE GRAPH H_k(V); Algorithm 2: SIMPLE BASE-(k+1) GRAPH A_k^simple(V); Algorithm 3: BASE-(k+1) GRAPH A_k(V). (A sketch of the baseline topology these constructions improve on follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/yukiTakezawa/BaseGraph. |
| Open Datasets | Yes | We used three datasets, FashionMNIST [41], CIFAR-{10, 100} [14], and used LeNet [15] for FashionMNIST and VGG-11 [32] for CIFAR-{10, 100}. |
| Dataset Splits | No | The paper mentions tuning the learning rate by grid search but does not specify validation dataset splits (e.g., percentage or sample count). |
| Hardware Specification | Yes | We ran all experiments on a server with eight Nvidia RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for it or any other software components. |
| Experiment Setup | Yes | The learning rate was tuned by grid search and we used the cosine learning rate scheduler [22]. We distributed the training dataset to nodes by using Dirichlet distributions with hyperparameter α [7], conducting experiments in both homogeneous and heterogeneous data distribution settings. As α approaches zero, the data distributions held by each node become more heterogeneous. We repeated all experiments with three different seed values and reported their averages. See Sec. H for more detailed settings. Tables 3 and 4 list the detailed hyperparameter settings used in Secs. 6 and F.3. (Illustrative sketches of the Dirichlet split and the cosine schedule follow the table.) |
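
For context on the title, here is a minimal sketch of the 1-peer exponential graph, the baseline topology that the BASE-(k+1) GRAPH is designed to go beyond. The definition follows the standard one from the decentralized-learning literature, not the paper's pseudocode (Algorithms 1-3 construct the new graphs and are available in the repository above); the function name is illustrative.

```python
import math

def one_peer_exponential_out_neighbor(n_nodes: int, t: int) -> dict:
    """Out-neighbor of every node at round t in the 1-peer exponential graph.

    Node i communicates with node (i + 2^(t mod tau)) % n, where
    tau = ceil(log2(n)), so each node cycles through exponentially spaced
    peers and has exactly one out-neighbor and one in-neighbor per round.
    Unlike this schedule, the BASE-(k+1) GRAPH is built so that all nodes
    reach exact consensus after finitely many rounds for any n.
    """
    tau = max(1, math.ceil(math.log2(n_nodes)))
    offset = 2 ** (t % tau)
    return {i: (i + offset) % n_nodes for i in range(n_nodes)}
```

For n = 8 the rounds cycle through offsets 1, 2, 4, so `one_peer_exponential_out_neighbor(8, 0)` maps node 0 to node 1, node 1 to node 2, and so on.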
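The Experiment Setup row describes distributing training data to nodes via Dirichlet distributions over class proportions. Below is a minimal sketch of that standard partitioning scheme (the approach of [7]), assuming labels are available as an array-like such as a torchvision dataset's `targets`; the function name and defaults are illustrative, not taken from the paper's code.

```python
import numpy as np

def dirichlet_partition(labels, n_nodes, alpha, seed=0):
    """Split sample indices across nodes with per-class Dirichlet proportions.

    Smaller alpha gives more heterogeneous (label-skewed) node datasets,
    matching the paper's description; larger alpha approaches a uniform split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    node_indices = [[] for _ in range(n_nodes)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Draw each node's share of class c from Dirichlet(alpha, ..., alpha).
        proportions = rng.dirichlet(alpha * np.ones(n_nodes))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for node, part in enumerate(np.split(idx, cuts)):
            node_indices[node].extend(part.tolist())
    return [np.asarray(ix) for ix in node_indices]
```

With eight nodes, `dirichlet_partition(dataset.targets, n_nodes=8, alpha=0.1)` returns per-node index sets usable with `torch.utils.data.Subset`; the α values the paper actually used are listed in its Tables 3 and 4.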
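The same row mentions a grid-searched learning rate and a cosine learning rate scheduler. A minimal PyTorch training-loop skeleton with those two ingredients is sketched below; the model, learning rate, and epoch count are placeholders (the paper's settings are in its Tables 3 and 4), and the gossip step is left as a comment since it depends on the chosen topology.

```python
import torch
import torch.nn as nn

num_epochs = 100            # placeholder; see the paper's Tables 3 and 4
model = nn.Linear(784, 10)  # stand-in for LeNet / VGG-11
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # lr found by grid search
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... local SGD steps on this node's Dirichlet-partitioned shard,
    # then parameter averaging with this round's neighbors in the topology ...
    scheduler.step()
```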