On the Optimal Time Complexities in Decentralized Stochastic Asynchronous Optimization

Authors: Alexander Tyurin, Peter Richtárik

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section J, we present experiments with quadratic optimization problems, logistic regression, and a neural network to substantiate our theoretical findings. Here we highlight the results from the logistic regression experiments: Figures 3, 5, 6, 7, 8, 9, 10.
Researcher Affiliation | Academia | Alexander Tyurin: KAUST, AIRI, Skoltech. Peter Richtárik: KAUST. (KAUST: King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; AIRI, Moscow, Russia; Skoltech: Skolkovo Institute of Science and Technology, Moscow, Russia.) Funding: the research reported in this publication was supported by funding from King Abdullah University of Science and Technology (KAUST): i) KAUST Baseline Research Scheme, ii) Center of Excellence for Generative AI, under award number 5940, iii) SDAIA-KAUST Center of Excellence in Artificial Intelligence and Data Science. The work of A.T. was partially supported by the Analytical Center under the RF Government (subsidy agreement 000000D730321P5Q0002, Grant No. 70-2021-00145 02.11.2021).
Pseudocode | Yes | Algorithm 2: Fragile SGD; Algorithm 3: Process 0 (running in worker j); Algorithm 4: Process i (running in worker i); Algorithm 5: Amelie SGD; Algorithm 6: Process 0 (running in worker j); Algorithm 7: Process i (running in worker i). (A generic sketch of this coordinator/worker structure is given after the table.)
Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The code is in the supplementary materials.
Open Datasets | Yes | On the MNIST dataset (LeCun et al., 2010); we test algorithms on an image recognition task, CIFAR10 (Krizhevsky et al., 2009).
Dataset Splits | No | The paper uses standard datasets (MNIST, CIFAR10) but does not explicitly state the proportions or methodology for training, validation, and test splits (e.g., an '80/10/10' split). (An illustrative split of this kind is sketched after the table.)
Hardware Specification | Yes | The working environment was emulated in Python 3.8 with one Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz.
Software Dependencies | No | The paper mentions Python 3.8 as part of the working environment but does not list version numbers for other software dependencies, such as libraries or solvers used in the experiments.
Experiment Setup | Yes | In all methods we fine-tune step sizes from the set {2^i : i ∈ [-20, 20]}. In Fragile SGD, we fine-tune the batch size S from the set {10, 20, 40, 80, 120}. Theorem 4 states: we take γ = 1/(2L) and batch size S = max{⌈σ²/ε⌉, 1}. (These choices are translated into code after the table.)
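
The experiment-setup row above translates directly into code. The following Python sketch is illustrative only: the function name theoretical_params and the example values of L, σ, and ε are placeholders, not taken from the paper.

```python
import math

# Step-size grid {2^i : i in [-20, 20]} fine-tuned in all methods.
step_sizes = [2.0 ** i for i in range(-20, 21)]

# Batch-size grid fine-tuned for Fragile SGD.
batch_sizes = [10, 20, 40, 80, 120]

def theoretical_params(L, sigma, eps):
    # Theorem 4's rule: gamma = 1/(2L) and S = max(ceil(sigma^2 / eps), 1).
    gamma = 1.0 / (2.0 * L)
    S = max(math.ceil(sigma ** 2 / eps), 1)
    return gamma, S

print(theoretical_params(L=1.0, sigma=0.5, eps=1e-2))  # (0.5, 25)
```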
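The pseudocode row lists a coordinator (Process 0) that runs the optimization step while the remaining processes supply stochastic gradients. The threaded Python sketch below only illustrates that generic coordinator/worker pattern; it is not the authors' Fragile SGD or Amelie SGD, and the quadratic objective, the function names, and all constants are assumptions made for the example.

```python
import queue
import threading
import numpy as np

def stochastic_grad(x, sigma, rng):
    # Hypothetical objective f(x) = 0.5 * ||x||^2, so grad f(x) = x;
    # workers return it with additive Gaussian noise of variance sigma^2.
    return x + sigma * rng.standard_normal(x.shape)

def worker(x_ref, grads, stop, sigma, seed):
    # Process i: repeatedly read the current point and push a stochastic
    # gradient; gradients may be stale, as in the asynchronous setting.
    rng = np.random.default_rng(seed)
    while not stop.is_set():
        grads.put(stochastic_grad(x_ref[0], sigma, rng))

def process_zero(x0, gamma, S, n_workers, n_steps, sigma=1.0):
    # Process 0: collect S stochastic gradients (from any workers, in any
    # order), average them, and take a gradient step with step size gamma.
    x_ref, grads, stop = [x0.copy()], queue.Queue(), threading.Event()
    threads = [threading.Thread(target=worker, args=(x_ref, grads, stop, sigma, i))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for _ in range(n_steps):
        batch = [grads.get() for _ in range(S)]
        x_ref[0] = x_ref[0] - gamma * np.mean(batch, axis=0)
    stop.set()
    for t in threads:
        t.join()
    return x_ref[0]

x = process_zero(np.ones(10), gamma=0.1, S=20, n_workers=4, n_steps=200)
print(np.linalg.norm(x))  # settles near the minimizer 0, up to gradient noise
```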
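Since the paper leaves the dataset splits unstated, the sketch below shows only what a conventional 80/10/10 split would look like; the proportions are an assumption made for illustration, and the random arrays merely stand in for MNIST.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for MNIST's images and labels (kept small here).
X = np.random.rand(1000, 784).astype(np.float32)
y = np.random.randint(0, 10, size=1000)

# Hypothetical 80/10/10 split: hold out 20%, then halve it into val and test.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.2, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
print(len(X_train), len(X_val), len(X_test))  # 800 100 100
```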