Stochastic Sign Descent Methods: New Algorithms and Better Theory
Authors: Mher Safaryan, Peter Richtárik
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate several aspects of our theoretical findings with numerical experiments. From Section 5 (Experiments): We verify several aspects of our theoretical results experimentally using the MNIST dataset with a feed-forward neural network (FNN) and the well-known Rosenbrock (nonconvex) function with d = 10 variables (a Rosenbrock sketch appears below the table). |
| Researcher Affiliation | Academia | Mher Safaryan¹, Peter Richtárik¹,²; ¹KAUST, Saudi Arabia; ²MIPT, Russia. |
| Pseudocode | Yes | Algorithm 1 SIGNSGD; Algorithm 2 PARALLEL SIGNSGD WITH MAJORITY VOTE; Algorithm 3 SSDM (a minimal signSGD sketch appears below the table) |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | using the MNIST dataset with a feed-forward neural network (FNN) |
| Dataset Splits | No | The paper mentions 'train and test accuracies' and discusses training and testing, but it does not explicitly state the use or size of a validation set or split. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or cloud instance types used for the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | The first column shows train and test accuracies with mini-batch size 128, averaged over 3 repetitions. We first tuned the constant step size over a logarithmic scale {1, 0.1, 0.01, 0.001, 0.0001} and then fine-tuned it. The left plot used a constant step size γ = 0.02; the right plot used a variable step size with γ₀ = 0.02. We set the mini-batch size to 1 and used the same initial point. (A sketch of this tuning protocol appears below the table.) |
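
For reference, here is a minimal Python sketch of the single-node signSGD update (Algorithm 1 in the paper). The oracle name `stochastic_grad` and its interface are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def sign_sgd(stochastic_grad, x0, step_size, n_iters):
    """Minimal sketch of signSGD (Algorithm 1): step along the sign of a
    stochastic gradient. `stochastic_grad(x)` is an assumed oracle that
    returns a stochastic gradient estimate at x."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        g = stochastic_grad(x)
        x = x - step_size * np.sign(g)  # coordinate-wise sign step
    return x
```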
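The Rosenbrock experiment can be reproduced from the function's standard definition. A sketch of the d = 10 objective and its analytic gradient follows, assuming the standard form f(x) = Σᵢ [100(xᵢ₊₁ − xᵢ²)² + (1 − xᵢ)²]; the paper does not restate the formula or the initial point, so both are assumptions here:

```python
import numpy as np

def rosenbrock(x):
    """Standard Rosenbrock: sum_i 100*(x[i+1] - x[i]**2)**2 + (1 - x[i])**2."""
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

def rosenbrock_grad(x):
    """Analytic gradient of the standard Rosenbrock function."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    g[:-1] = -400.0 * x[:-1] * (x[1:] - x[:-1] ** 2) - 2.0 * (1.0 - x[:-1])
    g[1:] += 200.0 * (x[1:] - x[:-1] ** 2)
    return g

x0 = np.zeros(10)  # d = 10 variables as in the paper; the start point is an assumption
```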
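The quoted tuning protocol (a coarse logarithmic grid followed by fine-tuning) can be sketched as below, reusing the signSGD and Rosenbrock sketches above. The `run_sign_sgd` helper, the iteration budget, and the fine-tuning grid are hypothetical; only the coarse grid values are quoted from the paper:

```python
def run_sign_sgd(step_size, n_iters=1000):
    """Hypothetical evaluation: final Rosenbrock value after n_iters signSGD
    steps from a fixed initial point (budget and start are assumptions)."""
    x = np.zeros(10)
    for _ in range(n_iters):
        x = x - step_size * np.sign(rosenbrock_grad(x))
    return rosenbrock(x)

coarse = [1, 0.1, 0.01, 0.001, 0.0001]   # logarithmic scale quoted in the paper
best = min(coarse, key=run_sign_sgd)      # keep the best coarse step size
fine = [best * f for f in (0.25, 0.5, 1.0, 2.0, 4.0)]  # assumed fine-tuning grid
best = min(fine, key=run_sign_sgd)
```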