Riemannian stochastic optimization methods avoid strict saddle points
Authors: Ya-Ping Hsieh, Mohammad Reza Karimi Jaghargh, Andreas Krause, Panayotis Mertikopoulos
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we aim to demonstrate the practical applicability of the theoretical framework proposed in our paper through numerical illustrations. To do so, we use a 2-dimensional torus as the optimization landscape, on which the multi-modal nature of the objective function is easily visualized (Fig. 2). Our objective function features three saddle points (in black) and one global minimizer (in red). We subject two Riemannian Robbins-Monro (RRM) schemes, namely (RSGD) and (RSEG), to initialization in proximity to these saddle points; this choice rigorously tests their ability to navigate the landscape and converge to the global optimum. In line with our theoretical predictions, both RRM methods avoid the saddle points and eventually reach the desired global minimum. This empirical confirmation reinforces the central message of our paper: stochastic Riemannian schemes only converge to local minimizers. |
| Researcher Affiliation | Academia | Ya-Ping Hsieh, Mohammad Reza Karimi, Andreas Krause (ETH Zürich): {yaping.hsieh, mkarimi}@inf.ethz.ch, krausea@ethz.ch; Panayotis Mertikopoulos (Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG & Archimedes RU, NKUA): panayotis.mertikopoulos@imag.fr |
| Pseudocode | Yes | Algorithm 1 (Riemannian stochastic gradient descent). Algorithm 2 (Retraction-based stochastic gradient descent). Algorithm 3 (Stochastic mirror descent). Algorithm 4 (Riemannian optimistic gradient). Algorithm 5 (Natural gradient descent). Algorithm 6 (Riemannian proximal point methods). Algorithm 7 (Riemannian stochastic extra-gradient). |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methods described, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper uses a "2-dimensional torus as the optimization landscape" for numerical illustrations, which is a synthetic landscape, not a publicly available dataset with concrete access information (e.g., URL, DOI, citation). |
| Dataset Splits | No | The paper describes numerical illustrations on a "2-dimensional torus as the optimization landscape" and does not specify any training, validation, or test splits for data. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the numerical illustrations. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers like PyTorch 1.9 or Python 3.8) used for the numerical illustrations. |
| Experiment Setup | No | The paper states only: "We subject two RRM schemes, i.e., (RSGD) and (RSEG), to initialization in proximity to these saddle points." This is a minimal description of the setup and lacks concrete hyperparameters (e.g., step-size schedule, noise level, number of iterations) or other system-level training settings; a hedged sketch of what such a setup could look like is given below the table. |
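
Since the paper reports no concrete hyperparameters, the sketch below is not a reproduction of the authors' experiment. It is a minimal, hedged illustration of the two tested schemes, (RSGD) and (RSEG), on a flat 2-torus: the surrogate objective f(θ, φ) = cos θ + cos φ (one minimizer, one maximizer, two saddles), the step-size schedule, the noise level, and the iteration count are all assumptions made for illustration; the paper's own landscape (three saddles and one global minimizer on an embedded torus, Fig. 2) and its parameters are not specified in the text.

```python
# Hypothetical sketch only: RSGD and RSEG on a flat 2-torus.
# The surrogate objective, step-size schedule, noise level, and
# iteration count below are assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)

def grad_f(x):
    """Riemannian gradient of the surrogate f(theta, phi) = cos(theta) + cos(phi)."""
    return -np.sin(x)                      # flat metric: coincides with the Euclidean gradient

def exp_map(x, v):
    """Exponential map on the flat torus [0, 2*pi)^2: move along v and wrap around."""
    return np.mod(x + v, 2 * np.pi)

def rsgd(x0, steps=5000, sigma=0.3):
    """Riemannian stochastic gradient descent (RSGD) with noisy gradients."""
    x = np.array(x0, dtype=float)
    for k in range(steps):
        gamma = 0.1 / np.sqrt(k + 1)       # assumed Robbins-Monro step-size schedule
        g = grad_f(x) + sigma * rng.standard_normal(2)
        x = exp_map(x, -gamma * g)
    return x

def rseg(x0, steps=5000, sigma=0.3):
    """Riemannian stochastic extra-gradient (RSEG): update uses the gradient at a leading point."""
    x = np.array(x0, dtype=float)
    for k in range(steps):
        gamma = 0.1 / np.sqrt(k + 1)
        g_half = grad_f(x) + sigma * rng.standard_normal(2)
        x_half = exp_map(x, -gamma * g_half)            # exploratory half-step
        g = grad_f(x_half) + sigma * rng.standard_normal(2)
        x = exp_map(x, -gamma * g)                      # update taken from the base point
    return x

# Initialize next to the saddle at (0, pi); both runs should drift toward the
# surrogate minimizer (pi, pi), mirroring the saddle-avoidance behaviour
# described in the quoted passage.
x_init = np.array([1e-3, np.pi])
print("RSGD:", rsgd(x_init))
print("RSEG:", rseg(x_init))
```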