How to Escape Saddle Points Efficiently

Authors: Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael I. Jordan

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations that depends only poly-logarithmically on dimension (i.e., it is almost dimension-free). The convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. The paper presents the first sharp analysis showing that (perturbed) gradient descent finds an approximate second-order stationary point in at most polylog(d) iterations, thus escaping all saddle points efficiently.
Researcher Affiliation | Collaboration | 1 University of California, Berkeley; 2 Duke University; 3 Microsoft Research India; 4 University of Washington
Pseudocode | Yes | Algorithm 1: Perturbed Gradient Descent (Meta-algorithm); Algorithm 2: Perturbed Gradient Descent, PGD(x0, ℓ, ρ, ϵ, c, δ, f); Algorithm 3: Perturbed Gradient Descent with Local Improvement, PGDli(x0, ℓ, ρ, ϵ, c, δ, f, β). A minimal sketch of Algorithm 2 is given after the table.
Open Source Code | No | The paper does not state that its source code is available and provides no link to a code repository.
Open Datasets | No | The paper is theoretical and conducts no empirical experiments involving datasets or training. The matrix factorization example is used for theoretical analysis, not empirical evaluation.
Dataset Splits | No | The paper is theoretical and describes no empirical experiments with training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and describes no empirical experiments, so no hardware specifications are mentioned.
Software Dependencies | No | The paper focuses on theoretical analysis and algorithm design, not on implementation details that would require software dependencies with version numbers.
Experiment Setup | No | The paper focuses on theoretical analysis and does not describe an experimental setup with hyperparameters or system-level training settings.
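
Since the table flags the paper's pseudocode as its only algorithmic artifact, the following is a minimal NumPy sketch of the perturb-then-descend loop of Algorithm 2 (PGD). The objective and gradient callables `f` and `grad_f`, the `max_iter` cap, and the random-number-generator handling are illustrative additions; the threshold formulas mirror the paper's stated parameter choices only up to transcription. Treat this as a sketch under those assumptions, not a reference implementation.

```python
import numpy as np

def pgd(f, grad_f, x0, ell, rho, eps, c, delta, delta_f, max_iter=10_000, rng=None):
    """Sketch of Perturbed Gradient Descent (Algorithm 2, Jin et al., ICML 2017).

    f, grad_f : callables returning the objective value and its gradient.
    ell, rho  : gradient- and Hessian-Lipschitz constants.
    eps       : target accuracy; c, delta, delta_f : constants from the paper.
    The thresholds below follow the paper's parameter settings in spirit;
    the constants are illustrative, not tuned.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x0.shape[0]
    chi = 3 * max(np.log(d * ell * delta_f / (c * eps**2 * delta)), 4)
    eta = c / ell                                    # step size
    r = np.sqrt(c) * eps / (chi**2 * ell)            # perturbation radius
    g_thres = np.sqrt(c) * eps / chi**2              # small-gradient threshold
    f_thres = (c / chi**3) * np.sqrt(eps**3 / rho)   # required decrease after a perturbation
    t_thres = int(np.ceil((chi / c**2) * ell / np.sqrt(rho * eps)))  # steps to wait

    x = x0.copy()
    x_before_noise = x.copy()
    t_noise = -t_thres - 1
    for t in range(max_iter):
        g = grad_f(x)
        # Perturb only when the gradient is small and no perturbation was added recently.
        if np.linalg.norm(g) <= g_thres and t - t_noise > t_thres:
            x_before_noise = x.copy()
            t_noise = t
            xi = rng.normal(size=d)
            xi *= (r * rng.uniform() ** (1.0 / d)) / np.linalg.norm(xi)  # uniform in ball of radius r
            x = x + xi
            g = grad_f(x)
        # If the perturbation did not yield enough decrease, the pre-perturbation
        # point is (with high probability) an approximate second-order stationary point.
        if t - t_noise == t_thres and f(x) - f(x_before_noise) > -f_thres:
            return x_before_noise
        x = x - eta * g  # plain gradient-descent step
    return x
```

The loop is ordinary gradient descent except at near-stationary points, where a single uniform perturbation from a small ball is injected; the function-decrease check after t_thres steps is what certifies escape from a saddle versus arrival at an approximate local minimum.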