Faster Gradient-Free Methods for Escaping Saddle Points
Authors: Hualin Zhang, Bin Gu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct several numerical experiments to verify the effectiveness of the proposed methods for escaping saddle points and the efficiency compared with the existing methods. |
| Researcher Affiliation | Academia | Hualin Zhang¹, Bin Gu¹,²; ¹Nanjing University of Information Science & Technology; ²MBZUAI; {zhanghualin98, jsgubin}@gmail.com |
| Pseudocode | Yes | Algorithm 1: Zeroth-Order Perturbed Accelerated Gradient Descent; Algorithm 2: Negative Curvature Exploitation (x_t, v_t, s); Algorithm 3: Zeroth-Order Perturbed Accelerated Gradient Descent with Accelerated Negative Curvature Finding; Algorithm 4: Zeroth-Order Accelerated Negative Curvature Finding without Renormalization (x, r, T) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the methodology described. |
| Open Datasets | No | The paper evaluates its methods on a 'cubic regularization problem' and a 'quartic function', both of which are mathematical formulations or synthetic problems, not publicly available datasets with specific access information. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. It appears to use synthetic functions initialized from a saddle point. |
| Hardware Specification | Yes | All experiments are performed on a computer with a six-core Intel Core i5-10500 CPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for its software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | In this experiment, we set ϵ = 10⁻². ... For Algorithm 1 and 3, the parameter settings basically follow Eq. (5) and Eq. (7). Specifically, we choose ϵ = 0.001 and the perturbation radius r and r′ are set to 0.001. The Lipschitz constants ℓ and ρ are selected based on a coarse grid search of the region {0.1, 1, 10, 100} × {0.1, 1, 10, 100}. Table 2: Parameter settings of the cubic regularization problem experiment. Table 3: Parameter settings of the cubic quartic function experiment. |
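
The coarse grid search over ℓ and ρ described in the Experiment Setup row can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' code: the toy objective `f` is a hypothetical two-dimensional function with a saddle at the origin (not the paper's quartic function), `zo_grad` is a coordinate-wise finite-difference estimator, and `run_trial` uses the step size η = 1/(4ℓ) and momentum parameter θ = 1/(4√κ) with κ = ℓ/√(ρϵ) from the standard perturbed accelerated gradient descent analysis, which may differ from the paper's Eq. (5) and Eq. (7). Only the grid {0.1, 1, 10, 100} × {0.1, 1, 10, 100}, ϵ = 0.001, and the perturbation radius 0.001 come from the row above.

```python
import itertools
import numpy as np

def f(x):
    # Hypothetical toy objective (NOT the paper's quartic function):
    # strict saddle at the origin, minima at (+1, 0) and (-1, 0).
    return 0.25 * x[0] ** 4 - 0.5 * x[0] ** 2 + 0.5 * x[1] ** 2

def zo_grad(func, x, h=1e-5):
    # Coordinate-wise central differences: only function values are used.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (func(x + e) - func(x - e)) / (2.0 * h)
    return g

def run_trial(ell, rho, eps=1e-3, r=1e-3, iters=2000, seed=0):
    # Assumed parameter rules (standard perturbed-AGD choices).
    eta = 1.0 / (4.0 * ell)
    kappa = ell / np.sqrt(rho * eps)
    theta = 1.0 / (4.0 * np.sqrt(kappa))
    # Perturb the saddle point at the origin by a vector of norm r.
    rng = np.random.default_rng(seed)
    u = rng.normal(size=2)
    x = r * u / np.linalg.norm(u)
    v = np.zeros(2)
    with np.errstate(all="ignore"):
        for _ in range(iters):
            y = x + (1.0 - theta) * v           # momentum look-ahead
            x_new = y - eta * zo_grad(f, y)     # zeroth-order gradient step
            if not np.all(np.isfinite(x_new)):  # diverged: reject this pair
                return np.inf
            v, x = x_new - x, x_new
        val = f(x)
    return float(val) if np.isfinite(val) else np.inf

# Coarse grid over (ell, rho), as in the Experiment Setup row.
values = [0.1, 1, 10, 100]
grid = list(itertools.product(values, repeat=2))
scores = {pair: run_trial(*pair) for pair in grid}
best = min(scores, key=scores.get)
print("best (ell, rho):", best, "final objective:", scores[best])
```

Selecting the pair by final objective value is one plausible criterion; the paper does not state which criterion its grid search used.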