Stochastic Sign Descent Methods: New Algorithms and Better Theory

Authors: Mher Safaryan, Peter Richtárik

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate several aspects of our theoretical findings with numerical experiments." (Section 5, Experiments: "We verify several aspects of our theoretical results experimentally using the MNIST dataset with a feed-forward neural network (FNN) and the well-known Rosenbrock (nonconvex) function with d = 10 variables"; the Rosenbrock function is sketched after the table.)
Researcher Affiliation | Academia | Mher Safaryan (1), Peter Richtárik (1,2); (1) KAUST, Saudi Arabia; (2) MIPT, Russia.
Pseudocode | Yes | Algorithm 1: SIGNSGD; Algorithm 2: PARALLEL SIGNSGD WITH MAJORITY VOTE; Algorithm 3: SSDM. (A sketch of the first two appears after the table.)
Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | "using the MNIST dataset with a feed-forward neural network (FNN)"
Dataset Splits | No | The paper mentions "train and test accuracies" and discusses training and testing, but it does not explicitly state the use or size of a validation set or split.
Hardware Specification | No | The paper does not provide any specific hardware details, such as GPU or CPU models or cloud instance types, used for the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | "First column shows train and test accuracies with mini-batch size 128, averaged over 3 repetitions. We first tuned the constant step size over a logarithmic scale {1, 0.1, 0.01, 0.001, 0.0001} and then fine-tuned it. The left plot used constant step size γ = 0.02; the right plot used a variable step size with γ0 = 0.02. We set mini-batch size 1 and used the same initial point." (The tuning procedure is sketched after the table.)
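
For concreteness, here is a minimal Python sketch of one signSGD step (Algorithm 1) and one parallel majority-vote step (Algorithm 2), written from the standard formulations of these methods. Since no code is released, the names here (`sign_sgd_step`, `majority_vote_step`, the `stoch_grad` callable) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of signSGD (Algorithm 1) and parallel signSGD with
# majority vote (Algorithm 2), based on their standard formulations;
# not the authors' code (none is released).
import numpy as np

def sign_sgd_step(x, stoch_grad, lr):
    """One signSGD step: move along the coordinate-wise sign of a
    stochastic gradient estimate (hypothetical `stoch_grad` callable)."""
    return x - lr * np.sign(stoch_grad(x))

def majority_vote_step(x, worker_grads, lr):
    """One parallel step: each worker communicates only sign(g_m);
    the server applies the sign of the coordinate-wise sum, i.e. the
    majority vote of the workers' signs."""
    votes = np.sum([np.sign(g) for g in worker_grads], axis=0)
    return x - lr * np.sign(votes)
```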
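The d = 10 Rosenbrock objective from the experiments is presumably the standard multi-dimensional form f(x) = sum_i [100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2], minimized at x = (1, ..., 1); a sketch under that assumption:

```python
# Standard d-dimensional Rosenbrock function and its analytic gradient;
# assumed form of the d = 10 test function (the excerpt above does not
# spell out the exact definition used).
import numpy as np

def rosenbrock(x):
    """f(x) = sum_i 100*(x[i+1] - x[i]**2)**2 + (1 - x[i])**2."""
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

def rosenbrock_grad(x):
    """Coordinate-wise gradient of the function above."""
    g = np.zeros_like(x)
    g[:-1] = -400.0 * x[:-1] * (x[1:] - x[:-1] ** 2) - 2.0 * (1.0 - x[:-1])
    g[1:] += 200.0 * (x[1:] - x[:-1] ** 2)
    return g

x0 = np.zeros(10)  # d = 10 variables, as in the experiments
```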
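The "tune on a logarithmic grid, then fine-tune" procedure in the setup row can be sketched as below. The refinement grid and the use of the final objective value as the selection criterion are assumptions, and the hypothetical `train_loss` reuses the Rosenbrock sketch above as a stand-in objective.

```python
# Hypothetical sketch of the two-stage step-size tuning: a coarse sweep over
# the logarithmic grid {1, 0.1, 0.01, 0.001, 0.0001}, then refinement around
# the best value. Criterion and refinement grid are assumptions.
import numpy as np

def train_loss(lr, steps=100):
    """Stand-in objective: final Rosenbrock value after `steps` signSGD
    iterations from a fixed initial point (uses rosenbrock/rosenbrock_grad
    from the sketch above)."""
    x = np.zeros(10)
    for _ in range(steps):
        x = x - lr * np.sign(rosenbrock_grad(x))
    return rosenbrock(x)

coarse_grid = [1.0, 0.1, 0.01, 0.001, 0.0001]
best = min(coarse_grid, key=train_loss)                  # coarse logarithmic sweep
fine_grid = best * np.array([0.25, 0.5, 1.0, 2.0, 4.0])  # refine around the winner
best = float(min(fine_grid, key=train_loss))             # fine-tuned constant step
```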