Riemannian Stochastic Recursive Momentum Method for non-Convex Optimization

Authors: Andi Han, Junbin Gao

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiment results demonstrate the superiority of the proposed algorithm. In this section, we compare our proposed RSRM with other one-sample online methods.
Researcher Affiliation | Academia | Andi Han, Junbin Gao, Discipline of Business Analytics, The University of Sydney, {andi.han, junbin.gao}@sydney.edu.au
Pseudocode | Yes | Algorithm 1 Riemannian SRM (a generic recursive-momentum sketch is given after the table).
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the proposed methodology or a link to a code repository.
Open Datasets | Yes | MNIST [LeCun et al., 1998], COVTYPE from LibSVM [Chang and Lin, 2011], YALEB [Wright et al., 2008], CIFAR100 [Krizhevsky et al., 2009], COIL100 [Nene et al., 1996], KYLBERG [Kylberg, 2014].
Dataset Splits | No | The paper mentions mini-batch sizes for training but does not provide specific details about validation dataset splits (e.g., percentages, counts, or a dedicated validation set methodology).
Hardware Specification | Yes | All algorithms are coded in Matlab and experiments are conducted on a laptop with an i5-8600 3.1GHz CPU.
Software Dependencies | No | The paper states 'All algorithms are coded in Matlab' but does not specify a version number for Matlab or any other software dependencies.
Experiment Setup | Yes | For competing methods, we consider a square-root decaying step size η_t = η_0 t^(-1/2), as suggested in [Kasai et al., 2019]. We set the parameters of RSRM according to the theory, i.e. η_t = η_0 t^(-1/3) and ρ_t = ρ_0 t^(-2/3). A default value of ρ_0 = 0.1 provides good empirical performance. For all methods, η_0 is selected from {1, 0.5, 0.1, ..., 0.005, 0.001}. The gradient momentum parameter in cSGD-M and RAMSGRAD is set to 0.9 and the adaptation momentum parameter in cRMSProp, RAMSGRAD and RASA is set to 0.999. We choose a mini-batch size of 5 for RSRM and 10 for all other algorithms to ensure an identical per-iteration cost of gradient evaluation. The initial batch size for RSRM is fixed at 100 (except for the ICA problem, where it is set to 200). An illustrative sketch of these schedules is given after the table.
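
To make the pseudocode row concrete: Algorithm 1 (Riemannian SRM) is a recursive-momentum (STORM-style) method, so a minimal sketch of that update pattern on the unit sphere is given below. This is an illustration written from the general description only; the manifold, the leading-eigenvector objective, and the helper names (project_to_tangent, retract, transport, stochastic_grad) are assumptions made for this sketch, not the paper's released code.

```python
import numpy as np

def project_to_tangent(x, v):
    """Orthogonal projection of v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Retraction on the sphere: step along v in the ambient space, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def transport(x_new, v):
    """Vector transport by projection onto the tangent space at the new point."""
    return project_to_tangent(x_new, v)

def stochastic_grad(x, a):
    """Riemannian stochastic gradient of f(x) = -0.5 * (a' x)^2 for one sample a."""
    egrad = -(a @ x) * a
    return project_to_tangent(x, egrad)

def rsrm(x0, samples, eta0=0.1, rho0=0.1, n_iters=1000, seed=0):
    """Sketch of a Riemannian recursive-momentum loop with the decaying schedules quoted above."""
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    d = stochastic_grad(x, samples[rng.integers(len(samples))])  # initial gradient estimator
    for t in range(1, n_iters + 1):
        eta_t = eta0 * t ** (-1.0 / 3.0)             # step size: eta_0 * t^(-1/3)
        rho_t = min(1.0, rho0 * t ** (-2.0 / 3.0))   # momentum weight: rho_0 * t^(-2/3)
        x_new = retract(x, -eta_t * d)
        a = samples[rng.integers(len(samples))]      # one fresh sample per iteration
        g_new = stochastic_grad(x_new, a)
        g_old = stochastic_grad(x, a)                # same sample, previous iterate
        # Recursive momentum: transported correction of the previous estimator.
        d = g_new + (1.0 - rho_t) * transport(x_new, d - g_old)
        x = x_new
    return x
```

The defining line is the estimator update d = g_new + (1 - ρ_t) * transport(x_new, d - g_old): the same sample is evaluated at both the new and the previous iterate, and the previous estimator is transported into the new tangent space before being corrected.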
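
The schedules and batch settings in the experiment setup row are simple power-law decays and constants; the following short sketch (variable names are assumptions, not from any released code) restates them so the decay rates are explicit.

```python
def rsrm_schedule(t, eta0=0.1, rho0=0.1):
    """RSRM schedules from the theory: eta_t = eta_0 * t^(-1/3), rho_t = rho_0 * t^(-2/3)."""
    return eta0 * t ** (-1.0 / 3.0), rho0 * t ** (-2.0 / 3.0)

def baseline_schedule(t, eta0=0.1):
    """Square-root decaying step size used for the competing one-sample methods."""
    return eta0 * t ** (-1.0 / 2.0)

# Batch settings reported in the setup: mini-batch of 5 for RSRM vs. 10 for the
# other methods (equal per-iteration gradient cost), and an initial batch of 100
# for RSRM (200 for the ICA problem).
BATCH_SIZE_RSRM, BATCH_SIZE_OTHERS = 5, 10
INITIAL_BATCH_RSRM = 100
```

For example, rsrm_schedule(1000) returns (0.01, 0.001) with the illustrative defaults η_0 = ρ_0 = 0.1.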