MARINA: Faster Non-Convex Distributed Learning with Compression
Authors: Eduard Gorbunov, Konstantin P. Burlachenko, Zhize Li, Peter Richtarik
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct several numerical experiments to justify the theoretical claims of the paper. |
| Researcher Affiliation | Collaboration | 1Moscow Institute of Physics and Technology, Moscow, Russia 2Yandex, Moscow, Russia 3King Abdullah University of Science and Technology, Thuwal, Saudi Arabia. |
| Pseudocode | Yes | Algorithm 1 MARINA |
| Open Source Code | Yes | Our code is available at https://github.com/burlachenkok/marina. |
| Open Datasets | Yes | binary classification problem involving non-convex loss (11) with LibSVM data (Chang & Lin, 2011)... training ResNet-18 (He et al., 2016) at CIFAR100 (Krizhevsky et al., 2009) dataset. |
| Dataset Splits | No | The paper mentions using specific datasets but does not provide explicit train/validation/test splits, percentages, or sample counts for reproducibility. Standard splits might be implied for well-known datasets, but they are not stated. |
| Hardware Specification | No | The paper states 'The distributed environment is simulated' but does not provide any specific hardware details such as GPU/CPU models or memory specifications used for these simulations. |
| Software Dependencies | Yes | The distributed environment is simulated in Python 3.8 using MPI4PY and other standard libraries... The code is written in Python 3.9 using PyTorch 1.7 |
| Experiment Setup | Yes | Stepsizes for the methods are chosen according to the theory and the batchsizes for VR-MARINA and VR-DIANA are m/100... In all cases, we used the RandK sparsification operator with K ∈ {1, 5, 10}... Number of workers equals 5. Stepsizes for the methods were tuned and the batchsizes are m/50. |
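
The experiment-setup row refers to the RandK sparsification operator, and the pseudocode row to Algorithm 1 (MARINA). As a rough illustration only, and not code from the authors' linked repository, the sketch below shows an unbiased RandK compressor and a MARINA-style gradient-estimator update; the names `rand_k` and `marina_gradient_estimate` are ours, chosen for this example.

```python
import numpy as np


def rand_k(x: np.ndarray, k: int, rng: np.random.Generator) -> np.ndarray:
    """RandK sparsification: keep k uniformly sampled coordinates and scale
    them by d/k so the compressor is unbiased, i.e. E[rand_k(x)] = x."""
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out


def marina_gradient_estimate(g_prev, full_grads, compressed_diffs, full_sync):
    """One g^{k+1} update in the spirit of Algorithm 1 (MARINA): with some
    probability p every worker sends its full local gradient; otherwise each
    sends a compressed gradient difference Q(grad f_i(x^{k+1}) - grad f_i(x^k))
    and the server adds the average of those differences to the previous g^k."""
    if full_sync:
        return np.mean(full_grads, axis=0)
    return g_prev + np.mean(compressed_diffs, axis=0)


# Tiny usage example with 5 simulated workers and K = 5, echoing the setup above.
rng = np.random.default_rng(0)
d, n_workers, K = 100, 5, 5
grads_new = [rng.normal(size=d) for _ in range(n_workers)]  # local grads at x^{k+1}
grads_old = [rng.normal(size=d) for _ in range(n_workers)]  # local grads at x^k
g_prev = np.mean(grads_old, axis=0)
diffs = [rand_k(gn - go, K, rng) for gn, go in zip(grads_new, grads_old)]
full_sync = rng.random() < 0.1  # probability p of a full synchronization round
g_next = marina_gradient_estimate(g_prev, grads_new, diffs, full_sync)
```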