Riemannian Stochastic Recursive Momentum Method for Non-Convex Optimization
Authors: Andi Han, Junbin Gao
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiment results demonstrate the superiority of the proposed algorithm. In this section, we compare our proposed RSRM with other one-sample online methods. |
| Researcher Affiliation | Academia | Andi Han, Junbin Gao, Discipline of Business Analytics, The University of Sydney, {andi.han, junbin.gao}@sydney.edu.au |
| Pseudocode | Yes | Algorithm 1 Riemannian SRM |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the proposed methodology or a link to a code repository. |
| Open Datasets | Yes | MNIST [LeCun et al., 1998], COVTYPE from LIBSVM [Chang and Lin, 2011], YALEB [Wright et al., 2008], CIFAR100 [Krizhevsky et al., 2009], COIL100 [Nene et al., 1996], KYLBERG [Kylberg, 2014]. |
| Dataset Splits | No | The paper mentions mini-batch sizes for training but does not provide specific details about validation dataset splits (e.g., percentages, counts, or a dedicated validation set methodology). |
| Hardware Specification | Yes | All algorithms are coded in Matlab and experiments are conducted on a laptop with an i5-8600 3.1GHz CPU. |
| Software Dependencies | No | The paper states 'All algorithms are coded in Matlab' but does not specify a version number for Matlab or any other software dependencies. |
| Experiment Setup | Yes | For competing methods, we consider a square-root decaying step size η_t = η_0 t^(-1/2), suggested in [Kasai et al., 2019]. We set the parameters of RSRM according to the theory, i.e. η_t = η_0 t^(-1/3) and ρ_t = ρ_0 t^(-2/3). A default value of ρ_0 = 0.1 provides good empirical performance. For all methods, η_0 is selected from {1, 0.5, 0.1, ..., 0.005, 0.001}. The gradient momentum parameter in cSGD-M and RAMSGRAD is set to 0.9 and the adaptation momentum parameter in cRMSProp, RAMSGRAD and RASA is set to 0.999. We choose a mini-batch size of 5 for RSRM and 10 for all other algorithms to ensure an identical per-iteration cost of gradient evaluation. The initial batch size for RSRM is fixed to 100 (except for the problem of ICA, where it is set to 200). |
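
The schedules quoted in the experiment setup can be made concrete with a small sketch. The snippet below is a Euclidean analogue of the recursive-momentum (STORM-style) update that RSRM extends to Riemannian manifolds, using the decaying schedules η_t = η_0 t^(-1/3) and ρ_t = ρ_0 t^(-2/3), a mini-batch of 5, and an initial batch of 100 as in the quoted setup. The least-squares problem, variable names, and the omission of the manifold ingredients (retraction, vector transport) are illustrative assumptions for this sketch, not the paper's implementation (which is in Matlab).

```python
import numpy as np

# Hedged sketch: a Euclidean STORM-style recursive-momentum loop with the
# decaying schedules from the quoted setup. The toy least-squares problem and
# all constants below are assumptions made for illustration only.
rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def stoch_grad(x, idx):
    """Mini-batch gradient of 0.5 * mean((Ax - b)^2) on the sampled rows."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

eta0, rho0, batch, init_batch, T = 0.1, 0.1, 5, 100, 500
x = np.zeros(d)

# Initial momentum estimate from a larger batch, as in the quoted setup.
d_t = stoch_grad(x, rng.choice(n, init_batch, replace=False))

for t in range(1, T + 1):
    eta_t = eta0 * t ** (-1.0 / 3.0)   # step size eta_t = eta0 * t^(-1/3)
    rho_t = rho0 * t ** (-2.0 / 3.0)   # momentum weight rho_t = rho0 * t^(-2/3)
    x_new = x - eta_t * d_t            # gradient step (a retraction in the Riemannian case)
    idx = rng.choice(n, batch, replace=False)
    # Recursive momentum: fresh mini-batch gradient plus a correction of the
    # previous estimate evaluated on the same samples.
    d_t = stoch_grad(x_new, idx) + (1 - rho_t) * (d_t - stoch_grad(x, idx))
    x = x_new

print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))
```

The -1/3 and -2/3 rates are exactly the ones stated in the setup; the larger initial batch only seeds the first momentum estimate, after which each iteration evaluates two mini-batch gradients, matching the "identical per-iteration cost" comparison described above.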