STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization
Authors: Kfir Levy, Ali Kavis, Volkan Cevher
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work we propose STORM+, a new method that is completely parameter-free, does not require large batch-sizes, and obtains the optimal O(1/T^{1/3}) rate for finding an approximate stationary point. Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters. |
| Researcher Affiliation | Academia | Kfir Y. Levy (Technion) kfirylevy@technion.ac.il; Ali Kavis (EPFL) ali.kavis@epfl.ch; Volkan Cevher (EPFL) volkan.cevher@epfl.ch |
| Pseudocode | Yes | We describe our method in Alg. 1 and Eq. (8), and state its guarantees in Theorem 1. For completeness we present our method in Alg. 1. (An illustrative code sketch of the recursive-momentum update is given after this table.) |
| Open Source Code | No | The paper does not mention providing open-source code for the described methodology. It focuses purely on theoretical contributions and algorithmic development without any implementation details or links. |
| Open Datasets | No | This is a theoretical paper and does not involve experiments or the use of datasets for training. |
| Dataset Splits | No | This is a theoretical paper and does not involve experiments or the use of datasets, thus no training/validation/test splits are mentioned. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments, thus no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not describe any experiments or implementations requiring specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and convergence analysis. It does not describe any experimental setup details such as hyperparameters or training configurations. |
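
The pseudocode row above points to the paper's Alg. 1 and Eq. (8), but this report links no implementation. Below is a minimal Python sketch of a STORM-style recursive-momentum loop on a toy nonconvex objective. The objective, the step-size rule `eta`, and the momentum rule `a` are illustrative assumptions and not the paper's exact parameter-free choices from Eq. (8); only the recursive estimator d_t = ∇f(x_t; ξ_t) + (1 − a_t)(d_{t−1} − ∇f(x_{t−1}; ξ_t)) follows the STORM construction the paper builds on.

```python
import numpy as np

rng = np.random.default_rng(0)

def stoch_grad(x, noise):
    """Noisy gradient of the toy nonconvex objective f(x) = sum(x_i^2 / (1 + x_i^2))."""
    return 2 * x / (1 + x ** 2) ** 2 + noise

dim, sigma = 5, 0.1
x = rng.standard_normal(dim)

# First gradient estimate: a plain stochastic gradient.
d = stoch_grad(x, sigma * rng.standard_normal(dim))
sum_d_sq = float(d @ d)          # running sum of ||d_s||^2

for t in range(1, 2000):
    # Illustrative adaptive choices (placeholders, NOT the paper's Eq. (8)):
    eta = 0.5 / (1.0 + sum_d_sq) ** (1.0 / 3.0)   # step size shrinking like (sum ||d||^2)^{-1/3}
    a = min(1.0, (t + 1) ** (-2.0 / 3.0))          # momentum parameter decaying in t

    x_next = x - eta * d

    # Recursive momentum (STORM-style): the SAME noise sample is used at x_next and x.
    noise = sigma * rng.standard_normal(dim)
    d = stoch_grad(x_next, noise) + (1.0 - a) * (d - stoch_grad(x, noise))

    sum_d_sq += float(d @ d)
    x = x_next

print("final gradient-estimate norm:", np.linalg.norm(d))
```

The key detail is that both gradients in the correction term use the same stochastic sample, which is what lets the variance of the estimate shrink recursively. Per the quoted abstract, STORM+ additionally sets the learning rate and momentum parameters adaptively from accumulated gradient statistics so that no problem parameters or large batch sizes are required; the exact rules are in the paper's Alg. 1 and Eq. (8).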