Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence
Authors: Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments with various topologies, demonstrating that the BASE-(k + 1) GRAPH enables various decentralized learning methods to achieve higher accuracy with better communication efficiency than the existing topologies. |
| Researcher Affiliation | Collaboration | Yuki Takezawa (1,2), Ryoma Sato (1,2), Han Bao (1,2), Kenta Niwa (3), Makoto Yamada (2); 1: Kyoto University, 2: OIST, 3: NTT Communication Science Laboratories |
| Pseudocode | Yes | Algorithm 1 k-PEER HYPER-HYPERCUBE GRAPH H_k(V), Algorithm 2 SIMPLE BASE-(k+1) GRAPH A^simple_k(V), Algorithm 3 BASE-(k+1) GRAPH A_k(V) |
| Open Source Code | Yes | Our code is available at https://github.com/yukiTakezawa/BaseGraph. |
| Open Datasets | Yes | We used three datasets, Fashion-MNIST [41] and CIFAR-{10, 100} [14], and used LeNet [15] for Fashion-MNIST and VGG-11 [32] for CIFAR-{10, 100}. |
| Dataset Splits | No | The paper mentions tuning the learning rate by grid search but does not specify validation dataset splits (e.g., percentage or sample count). |
| Hardware Specification | Yes | We ran all experiments on a server with eight Nvidia RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for it or any other software components. |
| Experiment Setup | Yes | The learning rate was tuned by grid search, and we used the cosine learning rate scheduler [22]. We distributed the training dataset to nodes by using Dirichlet distributions with hyperparameter α [7], conducting experiments in both homogeneous and heterogeneous data distribution settings. As α approaches zero, the data distributions held by each node become more heterogeneous. We repeated all experiments with three different seed values and reported their averages. See Sec. H for more detailed settings. Tables 3 and 4 list the detailed hyperparameter settings used in Secs. 6 and F.3. |
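The Dirichlet-based data split referenced in the setup row [7] is a standard recipe for simulating non-IID federated/decentralized data: for each class, a proportion vector is drawn from Dirichlet(α) and the class's samples are divided among nodes accordingly. A minimal NumPy sketch, assuming a hypothetical helper name `dirichlet_partition` (not from the paper's codebase):

```python
import numpy as np

def dirichlet_partition(labels, n_nodes, alpha, seed=0):
    """Split sample indices across nodes via a per-class Dirichlet(alpha) draw.

    Smaller alpha -> more heterogeneous (non-IID) per-node label
    distributions; large alpha approaches a uniform IID split.
    """
    rng = np.random.default_rng(seed)
    node_indices = [[] for _ in range(n_nodes)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class-c samples assigned to each node.
        props = rng.dirichlet(alpha * np.ones(n_nodes))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for node, part in enumerate(np.split(idx, cuts)):
            node_indices[node].extend(part.tolist())
    return node_indices
```

Every sample is assigned to exactly one node, so the per-node index lists partition the dataset; sweeping α (e.g. 10, 1, 0.1) reproduces the homogeneous-to-heterogeneous settings the table describes.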