ANYTIME MINIBATCH: EXPLOITING STRAGGLERS IN ONLINE DISTRIBUTED OPTIMIZATION
Authors: Nuwan Ferdinand, Haider Al-Lawati, Stark Draper, Matthew Nokleby
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a convergence analysis and analyze the wall time performance. Our numerical results show that our approach is up to 1.5 times faster in Amazon EC2 and it is up to five times faster when there is greater variability in compute node performance. To evaluate the performance of AMB and compare it with that of FMB, we ran several experiments on Amazon EC2 for both schemes to solve two different classes of machine learning tasks: linear regression and logistic regression using both synthetic and real datasets. |
| Researcher Affiliation | Academia | Nuwan Ferdinand, Haider Al-Lawati, & Stark Draper Department of Electrical and Computer Engineering, University of Toronto {nuwan.ferdinand@,haider.al.lawati@mail.,stark.draper@}utoronto.ca Matthew Nokleby Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI matthew.nokleby@wayne.edu |
| Pseudocode | Yes | The pseudocode of the algorithm is provided in App. A (Algorithm 1, AMB Algorithm); a minimal Python sketch of this loop follows the table. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a specific repository link, an explicit code release statement, or mention of code in supplementary materials. |
| Open Datasets | Yes | For the logistic regression problem, we used the MNIST images of numbers from 0 to 9. We used MNIST training dataset that consists of 60,000 data points. |
| Dataset Splits | No | The paper mentions using training datasets (e.g., MNIST training dataset) but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | Yes | In all our experiments, we used t2.micro instances and ami-6b211202, a publicly available Amazon Machine Image (AMI), to launch the instances. A launch sketch follows the table. |
| Software Dependencies | No | Communication between nodes was handled through the Message Passing Interface (MPI). No specific version number is provided for MPI or any other software dependency. An mpi4py averaging sketch follows the table. |
| Experiment Setup | Yes | In FMB, each worker computed b = 6000 gradients. The average compute time during the steady-state phase was found to be 14.5 sec. Therefore, in the AMB case, the compute time for each worker was set to T = 14.5 sec. and we set Tc = 4.5 sec. Workers are allowed an average of r = 5 rounds of consensus to average their calculated gradients. The per-node fixed minibatch in FMB is b/n = 800, while the fixed compute time in AMB is T = 12 sec. and the communication time Tc = 3 sec. As in the linear regression experiment above, the workers on average go through r = 5 rounds of consensus. These quoted hyperparameters are collected in a config sketch below the table. |
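To make the pseudocode row concrete, here is a minimal Python sketch of the AMB compute phase as the paper describes it: each worker accumulates gradients for a fixed wall-clock budget T rather than a fixed minibatch size, so stragglers simply contribute fewer gradients instead of stalling the epoch. The helpers `sample_point` and `grad_fn` are hypothetical stand-ins, not names from the paper.

```python
import time
import numpy as np

def amb_compute_phase(w, sample_point, grad_fn, T):
    """One AMB compute phase on a single worker.

    Accumulates gradients at the current iterate w for a fixed time
    budget T (seconds). Returns the local gradient sum and gradient
    count, which the communication phase then averages across workers.
    """
    grad_sum = np.zeros_like(w)
    count = 0
    deadline = time.monotonic() + T
    while time.monotonic() < deadline:
        x, y = sample_point()          # draw one training example
        grad_sum += grad_fn(w, x, y)   # gradient of the loss at (x, y)
        count += 1
    return grad_sum, count
```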
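For the hardware row, launching the cited instance type and AMI can be scripted with boto3. This is only a sketch under my own assumptions: the region, instance count, and key/security settings are not given in the paper.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption

# Launch t2.micro instances from the public AMI cited in the paper.
response = ec2.run_instances(
    ImageId="ami-6b211202",
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=11,  # illustrative count: e.g. one master plus ten workers
)
print([inst["InstanceId"] for inst in response["Instances"]])
```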
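For the software row, the paper says gradients are combined over MPI but pins no versions. The sketch below uses mpi4py and an exact `Allreduce` as a stand-in for the paper's r rounds of approximate distributed consensus; array sizes are placeholders.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD

# Each worker leaves the compute phase with a local gradient sum and the
# number of gradients it managed to compute within its time budget T.
local_grad_sum = np.zeros(10)      # placeholder: model dimension 10
local_count = np.array([0.0])      # placeholder gradient count

global_grad_sum = np.empty_like(local_grad_sum)
global_count = np.empty_like(local_count)
comm.Allreduce(local_grad_sum, global_grad_sum, op=MPI.SUM)
comm.Allreduce(local_count, global_count, op=MPI.SUM)

# The effective minibatch gradient is the global gradient sum divided by
# the total number of gradients computed across all workers.
minibatch_grad = global_grad_sum / max(global_count[0], 1.0)
```

Run across four ranks with `mpiexec -n 4 python amb_allreduce.py` (script name hypothetical).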
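Finally, the experiment-setup row quotes hyperparameters for two tasks; collecting them in one structure makes the FMB/AMB comparison easier to reproduce. The values are the ones quoted above; the field names are my own.

```python
# Hyperparameters quoted in the paper's experiment setup (field names mine).
EXPERIMENTS = {
    "linear_regression": {
        "fmb": {"gradients_per_worker": 6000},          # b = 6000
        "amb": {"compute_time_T_sec": 14.5,
                "comm_time_Tc_sec": 4.5,
                "avg_consensus_rounds_r": 5},
    },
    "logistic_regression_mnist": {
        "fmb": {"gradients_per_worker": 800},           # b/n = 800
        "amb": {"compute_time_T_sec": 12.0,
                "comm_time_Tc_sec": 3.0,
                "avg_consensus_rounds_r": 5},
    },
}
```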