Convergence of Cubic Regularization for Nonconvex Optimization under KL Property
Authors: Yi Zhou, Zhe Wang, Yingbin Liang
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we explore the asymptotic convergence rate of CR by exploiting the ubiquitous Kurdyka-Łojasiewicz (KŁ) property of nonconvex objective functions. Specifically, we characterize the asymptotic convergence rate of various optimality measures for CR, including the function value gap, variable distance gap, gradient norm, and least eigenvalue of the Hessian matrix. Our results fully characterize the diverse convergence behaviors of these optimality measures in the full parameter regime of the KŁ property. Moreover, we show that the obtained asymptotic convergence rates of CR are order-wise faster than those of first-order gradient descent algorithms under the KŁ property. |
| Researcher Affiliation | Academia | Yi Zhou, Department of ECE, The Ohio State University, zhou.1172@osu.edu; Zhe Wang, Department of ECE, The Ohio State University, wang.10982@osu.edu; Yingbin Liang, Department of ECE, The Ohio State University, liang.889@osu.edu |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found. The algorithm update rule is presented as a mathematical equation (1). |
| Open Source Code | No | The paper does not contain any statement or link indicating that open-source code for the described methodology is provided. |
| Open Datasets | No | The paper is purely theoretical and does not involve experiments with datasets. |
| Dataset Splits | No | The paper is purely theoretical and does not involve experiments with datasets or specify any training/validation/test splits. |
| Hardware Specification | No | The paper describes theoretical work and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper describes theoretical work and does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not include details on experimental setup such as hyperparameters or system-level training settings. |
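For context on the update rule referenced in the Pseudocode row: the paper presents the cubic regularization (CR) step only as a mathematical equation. A sketch of the standard CR update in the form introduced by Nesterov and Polyak is given below; the paper's equation (1) presumably follows this template, though its exact constants and notation may differ.

```latex
% Standard cubic regularization step (Nesterov & Polyak form);
% x_k is the current iterate, M > 0 the cubic penalty parameter.
x_{k+1} \in \operatorname*{argmin}_{y \in \mathbb{R}^d}
\; \langle \nabla f(x_k),\, y - x_k \rangle
+ \tfrac{1}{2} \langle \nabla^2 f(x_k)(y - x_k),\, y - x_k \rangle
+ \tfrac{M}{6} \lVert y - x_k \rVert^3
```

Each step minimizes a second-order Taylor model of $f$ around $x_k$ plus a cubic penalty on the step length, which is what allows CR to escape strict saddle points, unlike first-order gradient descent.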