Faster Adaptive Decentralized Learning Algorithms

Authors: Feihu Huang, Jianyu Zhao

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct some numerical experiments on training nonconvex machine learning tasks to verify the efficiency of our proposed algorithms.
Researcher Affiliation | Academia | 1) College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; 2) MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China.
Pseudocode | Yes | Algorithm 1: Adaptive Momentum-Based Decentralized Optimization (AdaMDOS) Algorithm for Stochastic Optimization; Algorithm 2: Adaptive Momentum-Based Decentralized Optimization (AdaMDOF) Algorithm for Finite-Sum Optimization.
Open Source Code | No | The paper does not provide concrete access to source code, such as a specific repository link, an explicit code release statement, or code in the supplementary materials.
Open Datasets | Yes | We use the public w8a and covertype datasets (available at https://www.openml.org/), the MNIST dataset (LeCun et al., 2010), and the Tiny-ImageNet dataset (Le & Yang, 2015).
Dataset Splits | No | The paper specifies training and testing examples/splits for some datasets (e.g., MNIST, Tiny-ImageNet) but does not provide explicit details about a validation split or a cross-validation setup.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used to run its experiments; it only mentions a "decentralized network" with "clients" or "nodes", without further hardware specifications.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | In the experiments, we set the regularization parameter λ = 10^-5 and use the same initial solution x0 = x_i^0 = 0.01 * ones(d, 1) for all i ∈ [m] in all algorithms. For a fair comparison, we use the batch size b = 10 in all algorithms, set β1 = β2 = 0.9 in DADAM (Nazari et al., 2022) and DAMSGrad (Chen et al., 2023), set β1 = 0.9 in DAdaGrad (Chen et al., 2023), and set ϱ = βt = ηt = 0.9 for all t ≥ 1 in our algorithms.
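
The Experiment Setup row pins down the shared hyperparameters. Below is a minimal configuration sketch in Python, not the authors' code (none is released); the variable names, the dimension d, and the number of nodes m are illustrative assumptions, while the numerical values are taken directly from the row above.

```python
# Minimal configuration sketch for the reported experiment setup.
# All names below (config keys, d, m) are illustrative assumptions;
# only the numerical values come from the paper's description.
import numpy as np

d = 300   # problem dimension (assumption; depends on the dataset, e.g. w8a)
m = 20    # number of clients/nodes in the decentralized network (assumption)

config = {
    "reg_lambda": 1e-5,   # regularization parameter lambda = 10^-5
    "batch_size": 10,     # mini-batch size b shared by all algorithms
    # DADAM and DAMSGrad momentum parameters; DAdaGrad uses only beta1 = 0.9
    "beta1": 0.9,
    "beta2": 0.9,
    # AdaMDOS / AdaMDOF parameters: varrho = beta_t = eta_t = 0.9 for all t >= 1
    "varrho": 0.9,
    "beta_t": 0.9,
    "eta_t": 0.9,
}

# Same initial solution on every client: x_i^0 = 0.01 * ones(d, 1) for all i in [m]
x0 = 0.01 * np.ones((d, 1))
X_init = [x0.copy() for _ in range(m)]
```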
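
The Open Datasets row states only that the w8a and covertype data are available at https://www.openml.org/. As a convenience, here is a hypothetical loading sketch using scikit-learn's fetch_openml; the exact OpenML dataset names and versions are assumptions, not confirmed by the paper.

```python
# Hypothetical loading sketch for the OpenML-hosted datasets named in the paper.
# The dataset identifiers below are assumptions; the paper only states that
# w8a and covertype are available at https://www.openml.org/.
from sklearn.datasets import fetch_openml

# Fetch by name; a version warning may appear if several versions exist on OpenML.
w8a = fetch_openml(name="w8a", as_frame=False)
covertype = fetch_openml(name="covertype", as_frame=False)

X_w8a, y_w8a = w8a.data, w8a.target
X_cov, y_cov = covertype.data, covertype.target
print(X_w8a.shape, X_cov.shape)
```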