Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
Authors: Alexander Tyurin, Peter Richtárik
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, our theory is corroborated in practice: we see a significant improvement in experiments with nonconvex classification and training of deep learning models." and "A Experiments: We have tested all developed algorithms on practical machine learning problems." |
| Researcher Affiliation | Academia | Alexander Tyurin, KAUST, Saudi Arabia; Peter Richtárik, KAUST, Saudi Arabia |
| Pseudocode | Yes | Algorithm 1 DASHA and Algorithm 2 DASHA-SYNC-MVR |
| Open Source Code | Yes | Code: https://github.com/mysteryresearcher/dasha |
| Open Datasets | Yes | We take the mushrooms dataset (dimension d = 112, number of samples equals 8124) from LIBSVM (Chang & Lin, 2011) and CIFAR10 (Krizhevsky et al., 2009) |
| Dataset Splits | No | The paper mentions splitting data (e.g., 'randomly split the dataset between 5 nodes') and using specific datasets, but does not provide explicit train/test/validation split percentages, absolute counts, or references to predefined splits for reproducibility. |
| Hardware Specification | Yes | A distributed environment was emulated on a machine with Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz and 64 cores. Deep learning experiments were conducted with NVIDIA A100 GPU with 40GB memory (each deep learning experiment uses at most 5GB of this memory). |
| Software Dependencies | Yes | The code was written in Python 3.6.8 using PyTorch 1.9 (Paszke et al., 2019). |
| Experiment Setup | Yes | In all experiments, we take the parameters of algorithms predicted by the theory (stated in the convergence rate theorems of our paper and in (Gorbunov et al., 2021)), except for the step sizes, which we fine-tune using a set of powers of two {2^i \| i ∈ [-10, 10]}, and use the Rand K compressor. (A minimal illustrative sketch of this setup follows the table.) |
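
The experiment setup above relies on a Rand-K sparsification compressor and a step-size grid of powers of two. Below is a minimal sketch of what such a setup could look like; the function names and the unbiased d/K scaling convention are assumptions based on the standard Rand-K definition, not code taken from the authors' repository.

```python
import numpy as np

def rand_k(x: np.ndarray, k: int, rng: np.random.Generator) -> np.ndarray:
    """Rand-K compressor: keep k coordinates chosen uniformly at random.

    The surviving coordinates are scaled by d / k so the compressor is
    unbiased, i.e. E[rand_k(x)] = x (standard Rand-K convention; assumed
    here, not confirmed against the DASHA repository).
    """
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out

# Step-size grid from the quoted setup: powers of two 2^i for i in [-10, 10].
step_sizes = [2.0 ** i for i in range(-10, 11)]

# Hypothetical usage: pick the step size whose run attains the best final
# objective; `run_method` is a placeholder, not a function from the paper.
# best = min(step_sizes, key=lambda s: run_method(step_size=s))
```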