A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning
Authors: Konstantin Mishchenko, Franck Iutzeler, Jérôme Malick, Massih-Reza Amini
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report some of the numerical results here; others are included in supplementary material. Comparison: We compare our algorithm DAve-RPG in terms of speed of convergence with its two main competitors in our distributed framework (with splitting of examples and no shared memory)... Delay-tolerance: In Figure 4, we exhibit the resilience of our algorithm to delays by introducing additional simulated delays. We use the RCV1 dataset distributed evenly among M = 10 machines... Trade-off communication vs. computation: To illustrate the trade-off between communication and computation, we increase the number of inner iterations of each worker (p = 1, 4, 7, 10). Scalability: We ran the algorithm with different numbers of workers and measured its speedup as the inverse of the required time to reach suboptimality 10⁻²; this is represented in Fig. 6. (A short sketch of the delay-injection and speedup probes is given after the table.) |
| Researcher Affiliation | Academia | King Abdullah University of Science & Technology (KAUST); Univ. Grenoble Alpes; CNRS and LJK. Correspondence to: Franck Iutzeler <franck.iutzeler@univ-grenoble-alpes.fr> |
| Pseudocode | Yes | 3.3. Full algorithm. Master: Initialize x̄ = x0, k = 0; while not converged do: when an agent finishes an iteration, receive an adjustment Δ from it, set x̄ ← x̄ + Δ, send x̄ to the agent in return, k ← k + 1; end; interrupt all slaves; output x̂ = prox_{γr}(x̄). Worker i: Initialize x = x_i^0 = x̄^0; while not interrupted by master do: receive the most recent x̄; take x from the previous iteration; select a number of repetitions p; initialize Δ = 0; for q = 1 to p do: z ← prox_{γr}(x̄ + Δ); x⁺ ← z − γ (1/n_i) Σ_{j∈S_i} ∇ℓ_j(z); Δ ← Δ + π_i (x⁺ − x); x ← x⁺; end; send the adjustment Δ to the master; end. (A runnable Python sketch of this master/worker loop follows the table.) |
| Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | We use publicly available datasets, presented in Table 1. ... https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ |
| Dataset Splits | No | The paper refers to evaluating on the 'training set' and does not explicitly specify split percentages or counts for training, validation, or test sets, nor does it reference predefined splits with citations for these partitions. |
| Hardware Specification | Yes | Each slave is allocated 1 CPU with 4 GB of memory. |
| Software Dependencies | No | The paper states: 'These algorithms were implemented in Python by using sockets for communications between the workers and the master.' However, it does not specify version numbers for Python or any other software libraries or solvers used. |
| Experiment Setup | Yes | with the hyperparameter λ2 fixed to the typical value λ2 = 1/n. ... We fixed the number of inner iterations per worker to p = 1. ... increase the number of inner-iterations of each worker (p = 1, 4, 7, 10). |
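
Below is a minimal, serial Python simulation of the DAve-RPG master/worker loop quoted in the Pseudocode row, specialized to ℓ1-regularized logistic regression. It is a sketch under assumptions: the helper names (`soft_threshold`, `logistic_grad`), the round-robin pass that stands in for asynchronous agents, and the pure ℓ1 regularizer are illustrative choices, not the authors' socket-based implementation.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (plays the role of prox_{gamma r})."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def logistic_grad(w, A, y):
    """Gradient of the average logistic loss over a local shard (A, y), labels in {-1, +1}."""
    margins = y * (A @ w)
    return -(A.T @ (y / (1.0 + np.exp(margins)))) / len(y)

def dave_rpg(shards, d, gamma, lam, p=1, n_epochs=50):
    """shards: list of (A_i, y_i) per worker; pi_i = n_i / n weights each adjustment."""
    n = sum(len(y) for _, y in shards)
    x_bar = np.zeros(d)                       # master's combined point
    x_prev = [np.zeros(d) for _ in shards]    # each worker's last local point
    for _ in range(n_epochs):
        for i, (A_i, y_i) in enumerate(shards):   # round-robin stands in for asynchrony
            pi_i = len(y_i) / n
            delta, x = np.zeros(d), x_prev[i]
            for _ in range(p):                    # p repeated proximal-gradient steps
                z = soft_threshold(x_bar + delta, gamma * lam)
                x_plus = z - gamma * logistic_grad(z, A_i, y_i)
                delta = delta + pi_i * (x_plus - x)
                x = x_plus
            x_prev[i] = x
            x_bar = x_bar + delta                 # master applies the received adjustment
    return soft_threshold(x_bar, gamma * lam)     # output prox_{gamma r}(x_bar)

# Toy usage: 10 workers on random data; gamma and lam are arbitrary sketch values.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, shards = 20, []
    for _ in range(10):
        A = rng.standard_normal((50, d))
        y = np.sign(rng.standard_normal(50))
        shards.append((A, y))
    x_hat = dave_rpg(shards, d, gamma=0.5, lam=0.01, p=4, n_epochs=30)
    print("nonzeros:", np.count_nonzero(x_hat))
```

The single accumulator `x_bar` mimics the master's point: each worker returns only the weighted adjustment π_i (x⁺ − x), so the master never needs the workers' full local states.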
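
The Research Type row mentions two experimental probes: injecting additional simulated delays (Figure 4) and measuring speedup as the inverse of the time to reach suboptimality 10⁻² (Figure 6). The following is a hedged sketch of how such probes could be coded; the function names and the uniform-delay model are assumptions, not the authors' protocol.

```python
import random
import time

def maybe_delay(worker_id, slow_workers, max_extra_sec=0.1):
    """Sleep a random extra amount on designated 'slow' workers to simulate stragglers."""
    if worker_id in slow_workers:
        time.sleep(random.uniform(0.0, max_extra_sec))

def speedup(wall_clock_by_workers, baseline=1):
    """Speedup of an M-worker run, taken as the inverse ratio of the wall-clock time
    needed to reach suboptimality 1e-2, relative to the baseline run."""
    t0 = wall_clock_by_workers[baseline]
    return {m: t0 / t for m, t in wall_clock_by_workers.items()}

# Example: hypothetical times (seconds) to reach suboptimality 1e-2 with 1, 5, 10 workers.
print(speedup({1: 120.0, 5: 30.0, 10: 18.0}))
```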