Permutation Compressors for Provably Faster Distributed Nonconvex Optimization
Authors: Rafał Szlendak, Alexander Tyurin, Peter Richtárik
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We corroborate our theoretical results with carefully engineered synthetic experiments with minimizing the average of nonconvex quadratics, and on autoencoder training with the MNIST dataset. |
| Researcher Affiliation | Academia | Rafał Szlendak (KAUST, Saudi Arabia); Alexander Tyurin (KAUST, Saudi Arabia); Peter Richtárik (KAUST, Saudi Arabia) |
| Pseudocode | Yes | Algorithm 1 MARINA |
| Open Source Code | No | The paper discusses implementation details (Appendix J) but does not provide a specific repository link or an explicit statement about releasing the source code for their methodology. |
| Open Datasets | Yes | training with the MNIST dataset (LeCun et al., 2010). |
| Dataset Splits | Yes | Initially, we randomly split MNIST into $n + 1$ parts: $D_0, D_1, \dots, D_n$, where $n = 1000$ is the number of nodes. Then, for all $i \in \{1, \dots, n\}$, the $i$th node takes split $D_0$ with probability $\hat{p}$, or split $D_i$ with probability $1 - \hat{p}$. We define the chosen split as $\hat{D}_i$. |
| Hardware Specification | Yes | All methods are implemented in Python 3.6 and run on a machine with 24 Intel(R) Xeon(R) Gold 6146 CPU @ 3.20GHz cores with 32-bit precision. |
| Software Dependencies | No | All methods are implemented in Python 3.6. |
| Experiment Setup | Yes | We take MARINA's and EF21's parameters prescribed by the theory and performed a grid search for the step sizes for each compressor by multiplying the theoretical ones with powers of two. We fix λ = 1e-6, and dimension d = 1000 (see Figure 1). We then generated optimization tasks with the number of nodes $n \in \{10, 1000, 10000\}$ and $L \in \{0, 0.05, 0.1, 0.21, 0.91\}$. |
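
The Dataset Splits and Experiment Setup rows describe two small procedures that can be sketched in code. The sketch below is illustrative only and is not the authors' implementation: the function names `assign_splits` and `step_size_grid`, the near-equal part sizes, the seed handling, and the default exponent range are all assumptions, since the table does not specify them. `p_hat` stands for the probability with which a node takes the shared split.

```python
import random


def assign_splits(num_samples, n, p_hat, seed=0):
    """Assign one list of sample indices to each of n nodes.

    Assumption: the n + 1 parts are near-equal in size; the source only
    says the data is split randomly into n + 1 parts.
    """
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)
    # Partition the shuffled indices into n + 1 disjoint parts.
    # parts[0] plays the role of the shared split; parts[1..n] are per-node.
    parts = [indices[j::n + 1] for j in range(n + 1)]
    # Node i takes the shared part with probability p_hat,
    # otherwise it takes its own part i.
    return [parts[0] if rng.random() < p_hat else parts[i]
            for i in range(1, n + 1)]


def step_size_grid(theoretical_step, exponents=range(0, 8)):
    """Candidate step sizes: the theoretical value times powers of two.

    Assumption: the exponent range is hypothetical; the source does not
    state which powers of two were searched.
    """
    return [theoretical_step * 2.0 ** k for k in exponents]


if __name__ == "__main__":
    node_data = assign_splits(num_samples=60000, n=1000, p_hat=0.5)
    print(len(node_data), "nodes;", len(node_data[0]), "samples on node 1")
    print(step_size_grid(0.01, range(0, 4)))
```

With `p_hat = 0` every node holds its own disjoint part, and with `p_hat = 1` every node holds the same shared part, so under this reading the parameter interpolates between fully heterogeneous and fully homogeneous data across nodes.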