DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
Authors: Alexander Tyurin, Peter Richtárik
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, our theory is corroborated in practice: we see a significant improvement in experiments with nonconvex classification and training of deep learning models. and A Experiments We have tested all developed algorithms on practical machine learnings problems |
| Researcher Affiliation | Academia | Alexander Tyurin KAUST Saudi Arabia alexandertiurin@gmail.com Peter Richtárik KAUST Saudi Arabia richtarik@gmail.com |
| Pseudocode | Yes | Algorithm 1 DASHA and Algorithm 2 DASHA-SYNC-MVR |
| Open Source Code | Yes | Code: https://github.com/mysteryresearcher/dasha |
| Open Datasets | Yes | We take the mushrooms dataset (dimension d = 112, number of samples equals 8124) from LIBSVM (Chang & Lin, 2011) and CIFAR10 (Krizhevsky et al., 2009) |
| Dataset Splits | No | The paper mentions splitting data (e.g., 'randomly split the dataset between 5 nodes') and using specific datasets, but does not provide explicit train/test/validation split percentages, absolute counts, or references to predefined splits for reproducibility. |
| Hardware Specification | Yes | A distributed environment was emulated on a machine with Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz and 64 cores. Deep learning experiments were conducted with NVIDIA A100 GPU with 40GB memory (each deep learning experiment uses at most 5GB of this memory). |
| Software Dependencies | Yes | The code was written in Python 3.6.8 using Py Torch 1.9 (Paszke et al., 2019). |
| Experiment Setup | Yes | In all experiments, we take parameters of algorithms predicted by the theory (stated in the convergence rate theorems our paper and in (Gorbunov et al., 2021)), except for the step sizes we fine-tune them using a set of powers of two {2i | i [ 10, 10]} and use the Rand K compressor. |