Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Escape saddle points by a simple gradient-descent based algorithm
Authors: Chenyi Zhang, Tongyang Li
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we also perform numerical experiments that support our results. We run our negative curvature finding algorithms using GD or SGD in various landscapes and general classes of nonconvex functions, and use comparative studies to show that our Algorithm 1 and Algorithm 3 achieve a higher probability of escaping saddle points in far fewer iterations than PGD and PSGD (typically fewer than 1/3 of PGD's iteration count and 1/2 of PSGD's, respectively). Moreover, we perform numerical experiments benchmarking the solution quality and iteration complexity of our algorithm against accelerated methods. |
| Researcher Affiliation | Academia | 1 Institute for Interdisciplinary Information Sciences, Tsinghua University, China 2 Center on Frontiers of Computing Studies, Peking University, China 3 School of Computer Science, Peking University, China 4 Center for Theoretical Physics, Massachusetts Institute of Technology, USA |
| Pseudocode | Yes | Algorithm 1: Negative Curvature Finding(x, r, T). Algorithm 2: Perturbed Accelerated Gradient Descent with Accelerated Negative Curvature Finding(x0, η, γ, s, T′, r′). Algorithm 3: Stochastic Negative Curvature Finding(x0, rs, Ts, m). (A hedged Python sketch of the negative curvature finding iteration appears below the table.) |
| Open Source Code | Yes | All the experiments are performed on MATLAB R2019b on a computer with Six-Core Intel Core i7 processor and 16GB memory, and their codes are given in the supplementary material. |
| Open Datasets | No | The paper uses synthetic test functions (e.g., f(x1, x2) = (1/16)x1^4 - (1/2)x2^2) rather than publicly available datasets for its experiments. Therefore, there is no information about publicly available training data. (The test function is written out in code below the table.) |
| Dataset Splits | No | The paper focuses on theoretical convergence rates and numerical experiments on synthetic functions, and thus does not describe traditional training/validation/test dataset splits. |
| Hardware Specification | Yes | All the experiments are performed on MATLAB R2019b on a computer with Six-Core Intel Core i7 processor and 16GB memory |
| Software Dependencies | Yes | All the experiments are performed on MATLAB R2019b |
| Experiment Setup | Yes | GD-based experiments (Algorithm 1 vs. PGD): η = 0.05 (step length), r = 0.1 (ball radius in PGD and parameter r in Algorithm 1), M = 300 (number of samplings). SGD-based experiments (Algorithm 3 vs. PSGD): η = 0.02 (step length), r = 0.01 (variance in PSGD and rs in Algorithm 3), M = 300 (number of samplings). |
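For reference, the synthetic test function quoted in the Open Datasets row can be written out directly, and a quick check shows why it serves as a saddle-point benchmark. The function and the saddle at the origin come from the quoted text above; the script itself is our illustration.

```python
import numpy as np

def f(x):
    """Synthetic test function quoted in the paper: f(x1, x2) = x1^4/16 - x2^2/2."""
    x1, x2 = x
    return x1**4 / 16.0 - x2**2 / 2.0

def grad_f(x):
    """Analytic gradient of f."""
    x1, x2 = x
    return np.array([x1**3 / 4.0, -x2])

# The origin is a critical point (the gradient vanishes there), and the
# Hessian at the origin is diag(0, -1): the e2 direction carries negative
# curvature, so plain gradient descent started at (0, 0) never moves.
# That stall is exactly what saddle-escaping methods are designed to fix.
print(grad_f(np.array([0.0, 0.0])))   # -> [ 0. -0.] (i.e., zero gradient)
```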
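The paper's Algorithm 1 extracts a negative curvature direction using only gradient queries. The core observation is that ∇f(x + y) − ∇f(x) ≈ H(x)y for small y, so the iteration y ← y − η(∇f(x + y) − ∇f(x)) acts like a power method on I − ηH(x) and amplifies directions of negative curvature. The sketch below is a minimal illustration of that principle, not the authors' MATLAB code: the initialization and renormalization rule are our assumptions, and `eta` and `r` are set to the step length and radius reported in the Experiment Setup row.

```python
import numpy as np

def grad_f(x):
    """Gradient of the test function f(x1, x2) = x1^4/16 - x2^2/2 (see above)."""
    return np.array([x[0]**3 / 4.0, -x[1]])

def negative_curvature_direction(grad, x, r=0.1, eta=0.05, T=1000, rng=None):
    """Sketch of gradient-only negative curvature finding.

    grad(x + y) - grad(x) ~= H(x) @ y for small y, so each update multiplies
    y by roughly (I - eta * H(x)), which grows the components of y along
    eigenvectors of H(x) with negative eigenvalues.
    """
    rng = np.random.default_rng() if rng is None else rng
    y = rng.normal(size=x.shape)      # random start direction (assumption)
    y *= r / np.linalg.norm(y)
    for _ in range(T):
        y = y - eta * (grad(x + y) - grad(x))
        norm = np.linalg.norm(y)
        if norm > r:                  # keep y in the radius-r ball so the
            y *= r / norm             # local Hessian approximation holds
    return y / np.linalg.norm(y)     # unit-norm direction estimate

# At the saddle (0, 0) the most negative Hessian eigenvalue is -1 along e2,
# and the iteration recovers a direction close to (0, 1) or (0, -1).
print(negative_curvature_direction(grad_f, np.array([0.0, 0.0])))
```

On the quoted test function this recovers the e2 direction; whether the sketch matches the paper's reported iteration savings over PGD and PSGD is a question for the released MATLAB code in the supplementary material.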