Zeroth-Order Negative Curvature Finding: Escaping Saddle Points without Gradients

Authors: Hualin Zhang, Huan Xiong, Bin Gu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Numerical Experiments. Octopus Function. We first consider the octopus function proposed by Du et al. [12]. The octopus function has 2^d local optima, x = (±4τ, …, ±4τ)^T, and 2^d − 1 saddle points: (0, …, 0)^T, (±4τ, 0, …, 0)^T, …, (±4τ, …, ±4τ, 0)^T. We compare ZO-GD-NCF, ZPSGD, PAGD, and RSPI on the octopus function with growing dimensions. The parameters of the octopus function are set to τ = e, L = e, γ = 1. All algorithms are initialized at the point (0, …, 0)^T, which is a strict saddle point and, among the 2^d − 1 saddle points, the one farthest from the optimal points. We set ϵ = 1e−4, δ = √(ρϵ) for all experiments and report the function value vs. the number of function queries in Figure 1.
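The ZO (zeroth-order) algorithms compared in this experiment access f only through function-value queries. As a minimal illustration of how such methods recover gradient information from queries alone, the sketch below implements a standard two-point Gaussian-smoothing gradient estimator; this is a generic construction for context, not the paper's exact estimator, and the function name and defaults are illustrative.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-6, num_dirs=100, rng=None):
    """Two-point zeroth-order gradient estimate.

    Averages directional finite differences (f(x + mu*u) - f(x)) / mu
    along random Gaussian directions u; since E[u u^T] = I, the
    average approximates the true gradient as num_dirs grows.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_dirs):
        u = rng.standard_normal(d)
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / num_dirs
```

Each estimate costs num_dirs + 1 function queries, which is why the experiment reports progress against the number of function queries rather than iterations.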
Researcher Affiliation | Academia | Hualin Zhang¹, Huan Xiong²·³, Bin Gu¹·³ — ¹Nanjing University of Information Science & Technology; ²Harbin Institute of Technology; ³Mohamed bin Zayed University of Artificial Intelligence
Pseudocode | Yes | Algorithm 1: ZO-NCF-Online-Weak(f, x0, δ) … Algorithm 2: ZO-NCF-Online — Input: f(·) = (1/n) Σ_{i=1}^n f_i(·), x0, δ > 0, p ∈ (0, 1] … Algorithm 3: ZO-NCF-Deterministic — Input: function f(·), point x0, negative curvature δ > 0, confidence p ∈ (0, 1] … Algorithm 4: ZO-SGD-NCF — Input: function f, starting point x0, confidence p ∈ (0, 1), ϵ > 0, δ > 0 … Algorithm 5: ZO-GD-NCF — Input: function f, starting point x0, confidence p ∈ (0, 1), ϵ > 0, δ > 0.
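Algorithms 1–3 above aim to return a direction along which the Hessian's curvature is below −δ, using only function evaluations. The toy sketch below demonstrates that goal by building a full finite-difference Hessian, which costs O(d²) queries; the paper's algorithms achieve the same end far more query-efficiently, so this is only an illustration of the problem being solved, with an invented function name and step size.

```python
import numpy as np

def zo_negative_curvature(f, x, delta, h=1e-4):
    """Toy zeroth-order negative-curvature check (illustrative only).

    Approximates every Hessian entry with a second-order finite
    difference of f, then returns a unit direction whose curvature
    is below -delta if one exists, and None otherwise.
    """
    d = x.shape[0]
    H = np.zeros((d, d))
    e = np.eye(d)
    for i in range(d):
        for j in range(d):
            H[i, j] = (f(x + h * e[i] + h * e[j]) - f(x + h * e[i])
                       - f(x + h * e[j]) + f(x)) / h**2
    H = 0.5 * (H + H.T)          # symmetrize against finite-difference noise
    lam, V = np.linalg.eigh(H)   # eigenvalues in ascending order
    if lam[0] < -delta:
        return V[:, 0]           # unit eigenvector of most negative curvature
    return None
```

At a strict saddle point such as the octopus function's origin, the returned direction is a descent direction along which gradient-based steps can escape.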
Open Source Code | No | The paper's reproducibility checklist answers 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]', but neither a specific URL nor an explicit statement in the main body indicates where the code is available.
Open Datasets | Yes | We first consider the octopus function proposed by Du et al. [12].
Dataset Splits | No | The numerical experiments use a synthetic benchmark (the octopus function), which involves no training/validation/test splits in the conventional sense; the paper specifies the function's parameters and the initialization, but no data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for software components or dependencies used in the experiments.
Experiment Setup | Yes | The parameters of the octopus function are set to τ = e, L = e, γ = 1. All algorithms are initialized at the point (0, …, 0)^T, which is a strict saddle point and, among the 2^d − 1 saddle points, the one farthest from the optimal points. We set ϵ = 1e−4 and δ = √(ρϵ) for all experiments. For RSPI, we follow the hyperparameter update strategy described in ([30], Appendix, Section F): we keep σ2 constant and update σ1 ← ρ_{σ1}·σ1 every T_{σ1} iterations. We conduct a grid search over T_{σ1} and ρ_{σ1}.
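The grid search over T_{σ1} and ρ_{σ1} described above can be sketched as follows. The run_rspi callable and the selection criterion (lowest final function value) are assumptions for illustration; the paper does not specify this interface.

```python
import itertools

def grid_search_rspi(run_rspi, T_grid, rho_grid):
    """Hypothetical grid search over RSPI's (T_sigma1, rho_sigma1).

    run_rspi(T, rho) is assumed to run RSPI with the given update
    schedule and return the final function value; the pair with the
    lowest value is selected.
    """
    best = None
    for T, rho in itertools.product(T_grid, rho_grid):
        val = run_rspi(T, rho)
        if best is None or val < best[0]:
            best = (val, T, rho)
    return best  # (best value, best T_sigma1, best rho_sigma1)
```

Exhaustive enumeration is reasonable here because only two hyperparameters are tuned, so the grid stays small.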