Linearly Converging Error Compensated SGD
Authors: Eduard Gorbunov, Dmitry Kovalev, Dmitry Makarenko, Peter Richtárik
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To justify our theory, we conduct several numerical experiments on the logistic regression problem with ℓ2-regularization: $\min_{x\in\mathbb{R}^d} f(x) = \frac{1}{N}\sum_{i=1}^{N}\log\left(1+\exp(-y_i(Ax)_i)\right) + \frac{\mu}{2}\lVert x\rVert^2$ (23), where $N$ is the number of data points, $x\in\mathbb{R}^d$ represents the weights of the model, $A\in\mathbb{R}^{N\times d}$ is the feature matrix, $y\in\{-1,1\}^N$ is the vector of labels, and $(Ax)_i$ denotes the $i$-th component of the vector $Ax$. Clearly, this problem is $L$-smooth and $\mu$-strongly convex with $L = \mu + \lambda_{\max}(A^\top A)/(4N)$, where $\lambda_{\max}(A^\top A)$ is the largest eigenvalue of $A^\top A$. The datasets were taken from the LIBSVM library [8], and the code was written in Python 3.7 using standard libraries. Our code is available at https://github.com/eduardgorbunov/ef_sigma_k. (A hedged sketch of computing $L$, $\mu$, and this objective from a LIBSVM dataset follows the table.) |
| Researcher Affiliation | Collaboration | Eduard Gorbunov (MIPT, Yandex and Sirius, Russia; KAUST, Saudi Arabia); Dmitry Kovalev (KAUST, Saudi Arabia); Dmitry Makarenko (MIPT, Russia); Peter Richtárik (KAUST, Saudi Arabia) |
| Pseudocode | Yes | Table 2: Error compensated methods developed in this paper. In all cases, $v_i^k = C(e_i^k + \gamma g_i^k)$. The full descriptions of the algorithms are included in the appendix. (An illustrative sketch of this error-compensation step follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/eduardgorbunov/ef_sigma_k. |
| Open Datasets | Yes | The datasets were taken from LIBSVM library [8], and the code was written in Python 3.7 using standard libraries. |
| Dataset Splits | No | The paper does not specify percentages or counts for training, validation, and test splits, nor does it explicitly mention a validation split. |
| Hardware Specification | Yes | We simulate the parameter-server architecture using one machine with an Intel(R) Core(TM) i7-9750 CPU 2.60 GHz in the following way. |
| Software Dependencies | No | The paper states: "The code was written in Python 3.7 using standard libraries." While it mentions Python 3.7, it does not specify version numbers for any key ancillary libraries or solvers, which are required for reproducibility. |
| Experiment Setup | Yes | In all experiments we use the stepsize $\gamma = 1/L$ and the ℓ2-regularization parameter $\mu = 10^{-4}\lambda_{\max}(A^\top A)/(4N)$. The starting point $x^0$ for each dataset was chosen so that $f(x^0) - f(x^*) \approx 10$. In experiments with stochastic methods we used batches of size 1 and uniform sampling for simplicity. For L-SVRG-type methods we choose $p = 1/m$. (A sketch of the L-SVRG ingredient follows the table.) |
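
The experiment setup pins down $L$, $\mu$, and $\gamma$ entirely in terms of the data matrix. Below is a minimal sketch of how those quantities and the objective (23) can be computed for a LIBSVM dataset; the dataset name `a9a` and the use of scikit-learn/SciPy are illustrative assumptions, not choices taken from the paper.

```python
# Hedged sketch: smoothness constant L, regularization mu = 1e-4 * lambda_max / (4N),
# stepsize gamma = 1/L, and the l2-regularized logistic loss from eq. (23).
import numpy as np
from scipy.sparse.linalg import svds
from sklearn.datasets import load_svmlight_file  # reads LIBSVM-format files

A, y = load_svmlight_file("a9a")                 # A: N x d sparse features, y in {-1, +1}^N (assumed dataset)
N, d = A.shape

sigma_max = svds(A, k=1, return_singular_vectors=False)[0]
lambda_max = sigma_max ** 2                      # largest eigenvalue of A^T A

mu = 1e-4 * lambda_max / (4 * N)                 # regularization parameter from the setup
L = mu + lambda_max / (4 * N)                    # L-smoothness constant
gamma = 1.0 / L                                  # stepsize used in all experiments

def f(x):
    """(1/N) * sum_i log(1 + exp(-y_i (Ax)_i)) + (mu/2) * ||x||^2."""
    z = A @ x
    return np.mean(np.log1p(np.exp(-y * z))) + 0.5 * mu * (x @ x)
```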
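The update quoted from Table 2 is an error-feedback step: each node compresses the error-corrected scaled gradient and keeps the compression residual locally. The sketch below uses a Top-K compressor as an illustrative choice of $C$; the compressor and function names are assumptions, not fixed by the paper.

```python
import numpy as np

def top_k(v, k):
    """Illustrative compressor C: keep the k largest-magnitude coordinates, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def worker_step(e_i, g_i, gamma, k):
    """One worker's error-compensated step (sketch of the rule quoted from Table 2)."""
    v_i = top_k(e_i + gamma * g_i, k)      # v_i^k = C(e_i^k + gamma * g_i^k), sent to the server
    e_i_new = e_i + gamma * g_i - v_i      # residual kept locally and fed back at the next iteration
    return v_i, e_i_new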
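```

The server then aggregates the received $v_i^k$ to update the model; the exact form of that update differs per method and is given in the paper's appendix. For the L-SVRG-type methods, the setup fixes the reference-point refresh probability at $p = 1/m$ and uses single-sample uniform sampling. A hedged sketch of that loopless-SVRG ingredient follows; the function names (`grad`, `grad_i`) and the interface are illustrative assumptions.

```python
import numpy as np

def lsvrg_estimator(x, w, full_grad_w, grad_i, m, rng):
    """g^k = grad_i(x^k) - grad_i(w^k) + grad f(w^k), with i sampled uniformly (batch size 1)."""
    i = rng.integers(m)
    return grad_i(x, i) - grad_i(w, i) + full_grad_w

def maybe_refresh_reference(x, w, full_grad_w, grad, m, rng):
    """With probability p = 1/m, reset the reference point w to x and recompute its full gradient."""
    if rng.random() < 1.0 / m:
        return x.copy(), grad(x)
    return w, full_grad_w
```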