Convergence of Cubic Regularization for Nonconvex Optimization under KL Property

Authors: Yi Zhou, Zhe Wang, Yingbin Liang

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we explore the asymptotic convergence rate of CR by exploiting the ubiquitous Kurdyka-Łojasiewicz (KŁ) property of nonconvex objective functions. Specifically, we characterize the asymptotic convergence rate of various types of optimality measures for CR, including the function value gap, variable distance gap, gradient norm, and least eigenvalue of the Hessian matrix. Our results fully characterize the diverse convergence behaviors of these optimality measures in the full parameter regime of the KŁ property. Moreover, we show that the obtained asymptotic convergence rates of CR are order-wise faster than those of first-order gradient descent algorithms under the KŁ property. (The KŁ inequality is sketched after the table.)
Researcher Affiliation | Academia | Yi Zhou, Department of ECE, The Ohio State University (zhou.1172@osu.edu); Zhe Wang, Department of ECE, The Ohio State University (wang.10982@osu.edu); Yingbin Liang, Department of ECE, The Ohio State University (liang.889@osu.edu)
Pseudocode | No | No pseudocode or clearly labeled algorithm block was found; the algorithm update rule is presented only as a mathematical equation (1). (A sketch of this update appears after the table.)
Open Source Code | No | The paper does not contain any statement or link indicating that open-source code for the described methodology is provided.
Open Datasets | No | The paper is purely theoretical and does not involve experiments with datasets.
Dataset Splits | No | The paper is purely theoretical and does not involve experiments with datasets or specify any training/validation/test splits.
Hardware Specification | No | The paper describes theoretical work and does not mention any specific hardware used for experiments.
Software Dependencies | No | The paper describes theoretical work and does not specify any software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not include details on experimental setup such as hyperparameters or system-level training settings.
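For readers unfamiliar with the property named in the abstract, here is a minimal sketch of the Kurdyka-Łojasiewicz inequality in its common exponent form; conventions for the exponent vary across papers, so the paper's exact statement and parameter regimes should be consulted.

```latex
% KL property (exponent form, a standard sketch): near a critical
% point x^* there exist c > 0 and an exponent \theta \in (0, 1]
% such that
\[
  \bigl| f(x) - f(x^*) \bigr|^{\theta} \;\le\; c \,\bigl\| \nabla f(x) \bigr\|.
\]
% The paper's rates are stated over regimes of the KL parameter:
% the asymptotic behavior of the function value gap f(x_k) - f(x^*)
% changes qualitatively depending on where the exponent falls.
```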
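The update referred to above as equation (1) is a cubic-regularization (CR) step; sketched below in the usual Nesterov-Polyak form (the paper's scaling of the penalty constant may differ):

```latex
\[
  x_{k+1} \in \operatorname*{argmin}_{y \in \mathbb{R}^d}
  \;\langle \nabla f(x_k),\, y - x_k \rangle
  + \tfrac{1}{2}\,(y - x_k)^{\top} \nabla^2 f(x_k)\,(y - x_k)
  + \tfrac{M}{6}\,\| y - x_k \|^{3},
\]
% where M > 0 upper-bounds the Lipschitz constant of \nabla^2 f.
```

Since the paper provides no pseudocode, the following is a hypothetical Python sketch of one such step, not the authors' code: the function name, tolerance, and the bisection-based subproblem solver are our illustration, and the degenerate "hard case" of the subproblem is ignored.

```python
import numpy as np

def cubic_reg_step(g, H, M, tol=1e-10):
    """Globally minimize the cubic model
        m(s) = g @ s + 0.5 * s @ H @ s + (M / 6) * ||s||**3
    via the optimality condition (H + (M/2)*r*I) s = -g with r = ||s||
    and H + (M/2)*r*I positive semidefinite, found by bisection on r.
    The degenerate 'hard case' is not handled."""
    g = np.asarray(g, dtype=float)
    H = np.asarray(H, dtype=float)
    I = np.eye(H.shape[0])
    lam_min = np.linalg.eigvalsh(H)[0]
    r_lo = max(0.0, -2.0 * lam_min / M)  # shift keeping the matrix PSD

    def step_norm(r):
        # Norm of the candidate step at shift r (matrix is PD for r > r_lo).
        return np.linalg.norm(np.linalg.solve(H + 0.5 * M * r * I, -g))

    # Bracket the fixed point ||s(r)|| = r from above.
    r_hi = r_lo + 1.0
    while step_norm(r_hi) > r_hi:
        r_hi *= 2.0
    # Bisection: ||s(r)|| - r is strictly decreasing on (r_lo, inf).
    while r_hi - r_lo > tol:
        r = 0.5 * (r_lo + r_hi)
        if step_norm(r) > r:
            r_lo = r
        else:
            r_hi = r
    return np.linalg.solve(H + 0.5 * M * r_hi * I, -g)

# Hypothetical usage: one CR step on f(x) = 0.5 * x @ A @ x, A indefinite.
A = np.array([[1.0, 0.0], [0.0, -2.0]])
x = np.array([1.0, 1.0])
x = x + cubic_reg_step(A @ x, A, M=10.0)
```

The bisection over the shift r is one standard way to solve the cubic subproblem exactly in small dimensions; large-scale implementations typically use Krylov or gradient-based inexact solvers instead.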