Heavy-ball Algorithms Always Escape Saddle Points

Authors: Tao Sun, Dongsheng Li, Zhe Quan, Hao Jiang, Shengguo Li, Yong Dou

IJCAI 2019

Reproducibility Variable Result LLM Response
Research Type Theoretical In this paper, we answer a question: can nonconvex heavy-ball algorithms with random initialization avoid saddle points? The answer is yes! Directly applying the existing proof technique to heavy-ball algorithms is hard because each iteration involves both the current and the previous point, so the algorithm cannot be formulated as an iteration x_{k+1} = g(x_k) under some mapping g. To this end, we design a new mapping on a new space. With some transformations, the heavy-ball algorithm can be interpreted as iterations of this mapping. Theoretically, we prove that heavy-ball gradient descent enjoys a larger stepsize than gradient descent for escaping saddle points.
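The iteration described above, x_{k+1} = x_k - α∇f(x_k) + β(x_k - x_{k-1}), can be sketched numerically. The snippet below is an illustrative implementation, not the paper's code: the test function, stepsize α, and momentum β are hypothetical choices, and the comment on the product-space mapping G paraphrases the reformulation idea from the abstract.

```python
import numpy as np

def grad(x):
    # f(x) = (x1^2 - 1)^2 + x2^2 has a strict saddle at the origin
    # (the Hessian there has a negative eigenvalue) and minimizers at (+/-1, 0).
    return np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])

def heavy_ball(x0, alpha=0.05, beta=0.5, iters=500):
    # Heavy-ball iteration: x_{k+1} = x_k - alpha*grad(x_k) + beta*(x_k - x_{k-1}).
    # Because the update depends on two consecutive points, it is not of the
    # form x_{k+1} = g(x_k); on the product space one can instead view it as
    # a single-step mapping G(x_k, x_{k-1}) = (x_{k+1}, x_k), which is the kind
    # of reformulation the paper's analysis relies on.
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        x_next = x - alpha * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Random initialization close to the saddle point at the origin.
rng = np.random.default_rng(0)
x_star = heavy_ball(rng.normal(scale=1e-3, size=2))
```

With a small random start near the saddle, the momentum term amplifies the component along the Hessian's negative-curvature direction, so the iterates drift away from the origin and settle at one of the minimizers (±1, 0) instead of the saddle.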
Researcher Affiliation Academia Tao Sun1, Dongsheng Li1, Zhe Quan2, Hao Jiang1, Shengguo Li1, and Yong Dou1; 1College of Computer, National University of Defense Technology; 2Hunan University. nudtsuntao@163.com, dsli@nudt.edu.cn, quanzhe@hnu.edu.cn, {haojiang, nudtlsg, yongdou}@nudt.edu.cn
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks. The algorithms are presented as mathematical equations.
Open Source Code No The paper does not provide concrete access to source code for the methodology described. It is a theoretical paper with no mention of code release.
Open Datasets No The paper is theoretical and does not conduct experiments on datasets, thus it does not provide access information for a publicly available or open dataset for training.
Dataset Splits No The paper is theoretical and does not conduct experiments on datasets, thus it does not provide specific dataset split information for validation.
Hardware Specification No The paper is theoretical and does not describe any experiments that would require specific hardware. No hardware specifications are mentioned.
Software Dependencies No The paper is theoretical and does not conduct experiments, therefore it does not list specific software dependencies with version numbers.
Experiment Setup No The paper is theoretical and does not describe any empirical experiments, so no specific experimental setup details like hyperparameters or training configurations are provided.