NEON2: Finding Local Minima via First-Order Oracles

Authors: Zeyuan Allen-Zhu, Yuanzhi Li

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We propose a reduction for non-convex optimization that can (1) turn a stationary-point-finding algorithm into a local-minimum-finding one, and (2) replace Hessian-vector product computations with only gradient computations. It works in both the stochastic and the deterministic settings, without hurting the algorithm's performance. As applications, our reduction turns Natasha2 into a first-order method without hurting its theoretical performance. It also converts SGD, GD, SCSG, and SVRG into algorithms finding approximate local minima, outperforming some best known results. Table 1: Complexity for finding ‖∇f(x)‖ ≤ ε and ∇²f(x) ⪰ −δI. Following tradition, in these complexity bounds, we assume the variance and smoothness parameters are constants, and only show the dependency on n, d, ε.
Researcher Affiliation | Collaboration | Zeyuan Allen-Zhu, Microsoft Research AI, Redmond, WA 98052, zeyuan@csail.mit.edu; Yuanzhi Li, Stanford University, Stanford, CA 94305, yuanzhil@stanford.edu
Pseudocode | Yes | Algorithm 1: Neon2online_weak(f, x0, δ); Algorithm 2: Neon2online(f, x0, δ, p); Algorithm 3: Neon2det(f, x0, δ, p)
Open Source Code | No | The paper does not provide an explicit statement about the release of source code for the described methodology or a link to a code repository. The only link provided (https://arxiv.org/abs/1711.06673) is to the full version of the paper on arXiv.
Open Datasets | No | The paper is theoretical and focuses on algorithm design and analysis. It does not mention using specific datasets for training or evaluation, nor does it provide access information for any dataset.
Dataset Splits | No | The paper is theoretical and does not report on empirical experiments with dataset splits. Therefore, no training/validation/test split information is provided.
Hardware Specification | No | The paper is theoretical and does not report on empirical experiments; therefore, no specific hardware specifications for running experiments are mentioned.
Software Dependencies | No | The paper is theoretical and focuses on algorithm design and analysis. It does not mention any specific software dependencies or version numbers required to replicate the work.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings.
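The key mechanism the abstract describes, replacing Hessian-vector product computations with gradient computations only, rests on the standard finite-difference identity ∇²f(x)·v ≈ (∇f(x + q·v) − ∇f(x)) / q for small q. The sketch below (an illustration under that assumption, not the authors' code; the toy quadratic and function names are ours) shows how a Hessian-vector product can be recovered from two gradient evaluations, which is what lets a negative-curvature search run with a first-order oracle alone.

```python
# Gradient-only Hessian-vector product, the idea behind Neon2's reduction.
# Toy objective: f(x) = 0.5 * x^T A x with A = diag(2, -1), so the Hessian
# is A and the second coordinate is a negative-curvature direction.

def grad(x):
    """Analytic gradient of the toy quadratic: A x with A = diag(2, -1)."""
    return [2.0 * x[0], -1.0 * x[1]]

def hvp_via_gradients(x, v, q=1e-5):
    """Approximate H(x) v using two gradient calls: (grad(x + q v) - grad(x)) / q."""
    x_shift = [xi + q * vi for xi, vi in zip(x, v)]
    g1, g0 = grad(x_shift), grad(x)
    return [(a - b) / q for a, b in zip(g1, g0)]

x = [0.3, 0.7]
v = [0.0, 1.0]                       # probe the second coordinate
approx = hvp_via_gradients(x, v)     # should be close to A v = [0.0, -1.0]
print(approx)
```

A negative inner product v·(H v) here flags v as a negative-curvature direction, which is exactly the certificate the Neon2 subroutines search for while making only gradient queries.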