Delay-Adaptive Step-sizes for Asynchronous Learning
Authors: Xuyang Wu, Sindri Magnusson, Hamid Reza Feyzmahdavian, Mikael Johansson
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a classification problem show that the proposed delay-adaptive step-sizes accelerate the convergence of the two methods compared to the best known fixed step-sizes from the literature. Section 4, Numerical experiments: Although the case for delay-adaptive step-sizes should be clear by now, we also demonstrate the end-effect on a simple machine learning problem. We consider classification problem on the training data sets of RCV1 (Lewis et al., 2004), MNIST (Deng, 2012), and CIFAR100 (Krizhevsky et al., 2009). |
| Researcher Affiliation | Collaboration | ¹Division of Decision and Control Systems, EECS, KTH Royal Institute of Technology, Stockholm, Sweden; ²Department of Computer and System Science, Stockholm University, Stockholm, Sweden; ³ABB Corporate Research, Västerås, Sweden. |
| Pseudocode | Yes | Algorithm 1 PIAG with delay-tracking ... Algorithm 2 Async-BCD with delay tracking |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We consider classification problem on the training data sets of RCV1 (Lewis et al., 2004), MNIST (Deng, 2012), and CIFAR100 (Krizhevsky et al., 2009) |
| Dataset Splits | No | The paper does not explicitly mention using a validation set or provide details about training/validation/test splits. It only states: "We split the samples in each data set into n = 8 batches and assign each batch to a single worker." |
| Hardware Specification | No | The paper states: "We run both PIAG and Async-BCD on a 10-core machine". This is too general and does not provide specific hardware details like CPU model, GPU model, or memory. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We pick (λ₁, λ₂) = (10⁻³, 10⁻⁴) for all three datasets. ... We compare the two delay-adaptive step-sizes with γ = h/L against the fixed step-size γ_k = h/(L(τ + 1/2)) from Sun et al. (2019); Deng et al. (2020), where h = 0.99 for all three step-sizes (larger step-sizes usually lead to faster convergence). |
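
To make the step-size comparison in the Experiment Setup row concrete, the following is a minimal sketch of the arithmetic only, reading the two step-sizes as γ = h/L and γ_k = h/(L(τ + 1/2)). The value h = 0.99 comes from the row above; the smoothness constant `L = 1.0` and the worst-case delay `tau = 7` are hypothetical placeholders chosen for illustration, and the function names are not from the paper. The sketch does not implement the delay-adaptive rules themselves.

```python
def base_step_size(h: float, L: float) -> float:
    """Base step-size gamma = h / L referenced by the delay-adaptive rules."""
    return h / L


def fixed_step_size(h: float, L: float, tau: int) -> float:
    """Fixed step-size gamma_k = h / (L * (tau + 1/2)) for a worst-case delay tau."""
    return h / (L * (tau + 0.5))


h = 0.99   # from the experiment setup
L = 1.0    # hypothetical smoothness constant (assumption, not from the paper)
tau = 7    # hypothetical worst-case delay (assumption, not from the paper)

gamma = base_step_size(h, L)
gamma_fixed = fixed_step_size(h, L, tau)

# The fixed step-size is smaller than the base step-size by a factor tau + 1/2.
ratio = gamma / gamma_fixed

print(f"base step-size:  {gamma:.4f}")
print(f"fixed step-size: {gamma_fixed:.4f}")
print(f"ratio:           {ratio:.2f}")
```

Under these assumed values the fixed step-size is a factor of τ + 1/2 smaller than γ, which suggests why a rule that reacts to the actually observed delays, rather than the worst case, could take larger steps.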