Escaping Saddle Points in Constrained Optimization

Authors: Aryan Mokhtari, Asuman Ozdaglar, Ali Jadbabaie

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we study the problem of escaping from saddle points in smooth nonconvex optimization problems subject to a convex set C. We propose a generic framework that yields convergence to a second-order stationary point of the problem, if the convex set C is simple for a quadratic objective function. Specifically, our results hold if one can find a ρ-approximate solution of a quadratic program subject to C in polynomial time, where ρ < 1 is a positive constant that depends on the structure of the set C. Under this condition, we show that the sequence of iterates generated by the proposed framework reaches an (ε, γ)-second-order stationary point (SOSP) in at most O(max{ε^{-2}, ρ^{-3}γ^{-3}}) iterations. We further characterize the overall complexity of reaching an SOSP when the convex set C can be written as a set of quadratic constraints and the objective function Hessian has a specific structure over the convex set C. Finally, we extend our results to the stochastic setting and characterize the number of stochastic gradient and Hessian evaluations needed to reach an (ε, γ)-SOSP.
Researcher Affiliation | Academia | Aryan Mokhtari, MIT, Cambridge, MA 02139, aryanm@mit.edu; Asuman Ozdaglar, MIT, Cambridge, MA 02139, asuman@mit.edu; Ali Jadbabaie, MIT, Cambridge, MA 02139, jadbabai@mit.edu
Pseudocode | Yes | Algorithm 1: Generic framework for escaping saddles in constrained optimization
Open Source Code | No | The paper does not provide any explicit statement about open-source code availability for the described methodology, nor does it include a link to a code repository.
Open Datasets | No | The paper is theoretical and does not conduct empirical experiments using a dataset, so no information about public dataset availability is provided.
Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments, so there is no mention of dataset splits (training, validation, test).
Hardware Specification | No | The paper is purely theoretical and reports no empirical experiments, so no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and reports no empirical experiments, so no software dependencies with version numbers are mentioned.
Experiment Setup | No | The paper is theoretical and describes no empirical experiments, so no setup details such as hyperparameters or training settings are provided.
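The Pseudocode row references the paper's Algorithm 1, a generic framework that alternates first-order steps over the convex set C with an approximate quadratic-program probe for negative curvature. The sketch below is an illustration of that idea, not the authors' implementation: the box constraint set, the step sizes, and the `toy_qp` eigenvector-based solver are all stand-ins for the ρ-approximate QP oracle the paper assumes.

```python
import numpy as np

def escape_saddles_framework(grad, hess, proj, solve_qp, x0,
                             eps=1e-3, gamma=1e-2, eta=0.1, max_iter=500):
    """Illustrative sketch of the generic framework: run projected-gradient
    steps until the gradient mapping is small, then call an (approximate)
    QP solver over C to either find an escape direction or stop at an
    approximate (eps, gamma)-SOSP."""
    x = proj(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        x_plus = proj(x - eta * grad(x))
        # Gradient-mapping norm: first-order stationarity measure on C.
        if np.linalg.norm(x - x_plus) / eta > eps:
            x = x_plus
            continue
        # Near first-order stationary: probe the quadratic model over C.
        d, model_decrease = solve_qp(grad(x), hess(x), x)
        if model_decrease < -gamma:
            x = proj(x + d)      # sufficient decrease found: escape
        else:
            return x             # approximate (eps, gamma)-SOSP
    return x

def box_proj(x):
    # C = [-1, 1]^n, a "simple" convex set with a closed-form projection.
    return np.clip(x, -1.0, 1.0)

def toy_qp(g, H, x, radius=0.5):
    # Crude stand-in for the rho-approximate QP solver the paper assumes:
    # try +/- the most-negative-curvature eigenvector, clipped to the box.
    _, V = np.linalg.eigh(H)
    v = V[:, 0]                  # eigenvector of the smallest eigenvalue
    best_d, best_val = np.zeros_like(x), 0.0
    for s in (radius, -radius):
        d = box_proj(x + s * v) - x
        val = g @ d + 0.5 * d @ H @ d
        if val < best_val:
            best_d, best_val = d, val
    return best_d, best_val

# Toy instance: f(x, y) = x^2 - y^2 has a saddle at the origin. Plain
# projected gradient started on the x-axis stalls at the saddle, while
# the QP probe pushes the iterate toward the boundary minimizers (0, +/-1).
grad = lambda z: np.array([2.0 * z[0], -2.0 * z[1]])
hess = lambda z: np.diag([2.0, -2.0])
x = escape_saddles_framework(grad, hess, box_proj, toy_qp,
                             x0=np.array([0.3, 0.0]))
```

On this toy problem the iterate first slides along the x-axis toward the saddle, the QP probe then detects the negative-curvature direction and steps off it, and the run terminates near (0, ±1), well below the saddle value f(0, 0) = 0.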