Adaptive Random Walk Gradient Descent for Decentralized Optimization
Authors: Tao Sun, Dongsheng Li, Bao Wang
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Experimental Results: We contrast the performance of adaptive and non-adaptive random walk algorithms for training machine learning models, including logistic regression (LR), multi-layer perceptron (MLP), and convolutional neural networks (CNNs). We evaluate the performance of the models on the benchmark MNIST and CIFAR10 image classification tasks, where MNIST/CIFAR10 contains 60K/50K and 10K/10K images for training and test, respectively. |
| Researcher Affiliation | Academia | College of Computer, National University of Defense Technology, Hunan, China; Department of Mathematics and Scientific Computing and Imaging Institute, University of Utah. |
| Pseudocode | Yes (see the illustrative sketch after the table) | Algorithm 1 Adaptive Random Walk Gradient Descent |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate the performance of the models on the benchmark MNIST and CIFAR10 image classification tasks, where MNIST/CIFAR10 contains 60K/50K and 10K/10K images for training and test, respectively. |
| Dataset Splits | No | The paper states 'MNIST/CIFAR10 contains 60K/50K and 10K/10K images for training and test, respectively' and 'We randomly partition the training data into ten even groups in an i.i.d. fashion', but it does not specify details about a validation split. |
| Hardware Specification | No | The paper describes the experimental setup in terms of models, datasets, and hyperparameters, but it does not provide specific hardware details such as GPU/CPU models or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes (see the configuration sketch after the table) | In training, we set the batch size to be 128. We fine-tune the step size for both adaptive and non-adaptive random walk gradient descent, and we use the initial learning rate of 0.003 and 0.1 for adaptive and non-adaptive algorithms, respectively. The momentum hyperparameter is set to 0.9 for both solvers. Moreover, we set the weight decay for both adaptive and non-adaptive algorithms to be 5 × 10⁻⁴. |
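
The Pseudocode row cites Algorithm 1 (Adaptive Random Walk Gradient Descent) but the paper's listing is not reproduced here. The following is a minimal sketch of the general idea under stated assumptions: a single model iterate (the "token") walks over the communication graph, the active node takes an AdaGrad-norm-style adaptive step on its local stochastic gradient, and the token is then handed to a random neighbor. The function names (`adaptive_random_walk_gd`, `grad_fn`, `neighbors`) and the exact step-size rule are hypothetical and may differ from the paper's Algorithm 1; only the initial learning rate of 0.003 is taken from the reported setup.

```python
# Hypothetical sketch only: the update rule, accumulator, and neighbor sampling
# are assumptions in the spirit of Algorithm 1, not a reproduction of it.
import numpy as np

def adaptive_random_walk_gd(grad_fn, neighbors, x0, node0,
                            lr=0.003, eps=1e-8, steps=10_000, seed=0):
    """grad_fn(node, x): stochastic gradient of node's local loss at x.
    neighbors[node]: nodes the token can hop to from `node` in one step."""
    rng = np.random.default_rng(seed)
    x, node = np.asarray(x0, dtype=float).copy(), node0
    accum = 0.0                                # running sum of squared gradient norms
    for _ in range(steps):
        g = grad_fn(node, x)                   # local stochastic gradient at the active node
        accum += float(np.dot(g, g))           # AdaGrad-norm-style accumulator
        x -= lr / (np.sqrt(accum) + eps) * g   # adaptive step on the shared iterate
        node = rng.choice(neighbors[node])     # pass the token to a random neighbor
    return x

if __name__ == "__main__":
    # Toy usage: 10 nodes on a ring, each with local loss 0.5 * ||x - c_i||^2.
    centers = np.random.default_rng(1).normal(size=(10, 5))
    ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
    x_final = adaptive_random_walk_gd(lambda i, x: x - centers[i], ring,
                                      x0=np.zeros(5), node0=0)
```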
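
The Dataset Splits and Experiment Setup rows pin down a batch size of 128, initial learning rates of 0.003 (adaptive) and 0.1 (non-adaptive), momentum 0.9, weight decay 5 × 10⁻⁴, and an i.i.d. partition of the training data into ten even groups. Since no code is released, the PyTorch sketch below is an illustration of that configuration only: the MNIST pipeline, the stand-in MLP, the random seed, and the mapping of the non-adaptive baseline onto `torch.optim.SGD` are assumptions, not the authors' implementation.

```python
# Hypothetical PyTorch setup mirroring the reported hyperparameters; the model,
# data pipeline, and optimizer mapping are assumptions for illustration only.
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())

# "We randomly partition the training data into ten even groups in an i.i.d. fashion"
num_nodes = 10
lengths = [len(train_set) // num_nodes] * num_nodes
node_datasets = random_split(train_set, lengths,
                             generator=torch.Generator().manual_seed(0))
node_loaders = [DataLoader(ds, batch_size=128, shuffle=True)  # batch size 128
                for ds in node_datasets]

model = torch.nn.Sequential(          # stand-in MLP; the paper also uses LR and CNNs
    torch.nn.Flatten(),
    torch.nn.Linear(28 * 28, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# Non-adaptive baseline with the reported hyperparameters (lr 0.1, momentum 0.9,
# weight decay 5e-4); the adaptive variant starts from lr 0.003 with its own rule.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
```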