Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?
Authors: Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter Glynn, Yinyu Ye, Li-Jia Li, Li Fei-Fei
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Numerical results. To validate our analysis, we test the convergence of Algorithm 2 against a standard Rosenbrock test function with d = 101 degrees of freedom... Our results are shown in Fig. 1. Starting from a random (but otherwise fixed) initial condition, we ran S = 100 realizations of DASGD (with and without delays). We then plotted a randomly chosen trajectory ("test sample" in Fig. 1), the sample average, and the min/max over all samples at every update epoch. |
| Researcher Affiliation | Collaboration | (1) Stanford University, Stanford, USA; (2) Univ. Grenoble Alpes, CNRS, Inria, LIG, 38000 Grenoble, France; (3) Google, Mountain View, USA. |
| Pseudocode | Yes | Algorithm 1: Running SGD on a Master-Slave Architecture; Algorithm 2: Distributed asynchronous stochastic gradient descent; Algorithm 3: Master's DAGD Update |
| Open Source Code | No | No explicit statement or link indicating the availability of open-source code for the methodology described in the paper. |
| Open Datasets | Yes | To validate our analysis, we test the convergence of Algorithm 2 against a standard Rosenbrock test function with d = 101 degrees of freedom, i.e., Σ_{i=1}^{100} [100(x_{i+1} − x_i^2)^2 + (1 − x_i)^2], with x_i ∈ [0, 2], i = 1, ..., 101. |
| Dataset Splits | No | The paper describes numerical experiments on a mathematical test function but does not specify train/validation/test dataset splits in the conventional machine learning sense. |
| Hardware Specification | No | The paper mentions numerical experiments but does not provide any specific hardware details like GPU/CPU models or cloud resources used. |
| Software Dependencies | No | The paper does not provide any specific software dependencies with version numbers. |
| Experiment Setup | Yes (see the sketch after this table) | In both cases, Algorithm 2 was run with a decreasing step-size of the form α_n ∝ 1/(n log n) and stochastic gradients drawn from a standard multivariate Gaussian distribution (i.e., zero mean and identity covariance matrix). |
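
The setup rows above fix the objective (101-dimensional Rosenbrock), the step-size schedule (α_n ∝ 1/(n log n)), and the gradient noise (standard Gaussian), so a minimal simulation can be sketched directly from them. The Python sketch below is not the authors' code: the delay model (uniform staleness up to `max_delay` updates), the noise scale, the iteration budget, and the projection onto the box [0, 2]^101 are illustrative assumptions, and the function name `dasgd` and its parameters are hypothetical.

```python
# Minimal sketch of the reported experiment: delayed stochastic gradient
# descent on the 101-dimensional Rosenbrock function. d = 101, the step-size
# rule alpha_n ~ 1/(n log n), the Gaussian gradient noise, and the box
# [0, 2]^101 come from the paper's description; the delay model, noise scale,
# iteration budget, and projection step are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 101  # degrees of freedom reported in the paper


def rosenbrock(x):
    """f(x) = sum_i [100*(x_{i+1} - x_i^2)^2 + (1 - x_i)^2]."""
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)


def rosenbrock_grad(x):
    """Exact gradient of the Rosenbrock function."""
    g = np.zeros_like(x)
    g[:-1] += -400.0 * x[:-1] * (x[1:] - x[:-1] ** 2) - 2.0 * (1.0 - x[:-1])
    g[1:] += 200.0 * (x[1:] - x[:-1] ** 2)
    return g


def dasgd(n_iters=20_000, max_delay=50, noise_std=1.0):
    """Simulate delayed SGD: gradients are evaluated at stale iterates,
    perturbed by standard Gaussian noise, then applied by the master."""
    x = rng.uniform(0.0, 2.0, size=d)   # random initial condition in [0, 2]^d
    history = [x.copy()]                 # past iterates, to draw stale copies from
    values = []
    for n in range(2, n_iters + 2):      # start at 2 so log(n) > 0
        delay = rng.integers(0, min(max_delay, len(history)))  # assumed delay model
        stale_x = history[-1 - delay]
        noisy_grad = rosenbrock_grad(stale_x) + noise_std * rng.standard_normal(d)
        alpha = 1.0 / (n * np.log(n))    # reported step-size schedule
        x = np.clip(x - alpha * noisy_grad, 0.0, 2.0)  # project back onto the box
        history.append(x.copy())
        values.append(rosenbrock(x))
    return x, values


if __name__ == "__main__":
    x_final, values = dasgd()
    print(f"f(x_0) ~ {values[0]:.2f}")
    print(f"f(x_T) ~ {values[-1]:.2f}")
```

Clipping each update back onto [0, 2]^101 is one simple way to respect the stated constraint x_i ∈ [0, 2]; the paper's Algorithm 2 may enforce feasibility differently.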