Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Almost Sure Convergence Rates Analysis and Saddle Avoidance of Stochastic Gradient Methods

Authors: Jun Liu, Ye Yuan

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The vast majority of convergence rate analyses for stochastic gradient methods in the literature focus on convergence in expectation, whereas trajectory-wise almost sure convergence is clearly important to ensure that any instantiation of the stochastic algorithms converges with probability one. Here we provide a unified almost sure convergence rates analysis for stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic Nesterov's accelerated gradient (SNAG) methods. We show, for the first time, that the almost sure convergence rates obtained for these stochastic gradient methods on strongly convex functions are arbitrarily close to their optimal possible convergence rates. For non-convex objective functions, we show not only that a weighted average of the squared gradient norms converges to zero almost surely, but also that the last iterates of the algorithms do. We further provide a last-iterate almost sure convergence rates analysis for stochastic gradient methods on general convex smooth functions, in contrast with most existing results in the literature, which only provide convergence in expectation for a weighted average of the iterates. The last-iterate almost sure convergence results also enable us to obtain almost sure avoidance of any strict saddle manifold by stochastic gradient methods with or without momentum.
Researcher Affiliation | Academia | Jun Liu EMAIL Department of Applied Mathematics, University of Waterloo, Waterloo, Canada; Ye Yuan EMAIL School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
Pseudocode | Yes | The iteration of the SGD method is given by x_{t+1} = x_t - α_t g_t, t ≥ 1 (11)... The iteration of the SHB method is given by x_{t+1} = x_t - α_t g_t + β(x_t - x_{t-1}), t ≥ 1 (16)... The iteration of the SNAG method is given by y_{t+1} = x_t - α_t g_t, x_{t+1} = y_{t+1} + β(y_{t+1} - y_t), t ≥ 1 (24)
Open Source Code | No | The paper does not contain any explicit statements about the release of open-source code, nor does it provide links to any code repositories.
Open Datasets | No | The paper refers to popular non-convex optimization benchmark functions such as the Griewank, Rastrigin, and Levy functions, and provides a general URL (https://www.sfu.ca/~ssurjano/optimization.html) describing them. However, it does not use or provide access information for any specific experimental datasets for empirical evaluation.
Dataset Splits | No | As the paper is theoretical and does not describe experiments using specific datasets, there is no mention of training/test/validation dataset splits.
Hardware Specification | No | The paper does not describe any experimental results or the hardware used for computation. It is a theoretical paper focusing on convergence analysis.
Software Dependencies | No | The paper does not specify any software dependencies or version numbers, consistent with its theoretical nature.
Experiment Setup | No | The paper focuses on theoretical analysis and does not describe any empirical experimental setup, hyperparameters, or training configurations.
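The SGD, SHB, and SNAG iterations quoted in the Pseudocode row can be written down directly. The sketch below is illustrative only and is not from the paper: the noisy quadratic objective, the step-size schedule α_t = 1/t, and all variable names are assumptions made for the demo.

```python
import numpy as np

def sgd_step(x, grad, alpha):
    # SGD (Eq. 11): x_{t+1} = x_t - alpha_t * g_t
    return x - alpha * grad

def shb_step(x, x_prev, grad, alpha, beta):
    # SHB (Eq. 16): x_{t+1} = x_t - alpha_t * g_t + beta * (x_t - x_{t-1})
    return x - alpha * grad + beta * (x - x_prev)

def snag_step(x, y, grad, alpha, beta):
    # SNAG (Eq. 24): y_{t+1} = x_t - alpha_t * g_t,
    #                x_{t+1} = y_{t+1} + beta * (y_{t+1} - y_t)
    y_next = x - alpha * grad
    x_next = y_next + beta * (y_next - y)
    return x_next, y_next

# Illustrative run on f(x) = 0.5 * ||x||^2 (gradient is x), with small
# additive Gaussian noise standing in for the stochastic gradient g_t.
rng = np.random.default_rng(0)
x = np.array([2.0, -1.0])
for t in range(1, 200):
    g = x + 0.01 * rng.standard_normal(2)  # noisy gradient estimate
    x = sgd_step(x, g, alpha=1.0 / t)      # diminishing step size 1/t
```

With the diminishing step size, each sampled trajectory of this run settles near the minimizer, which is the trajectory-wise (almost sure) behavior the paper analyzes.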
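The Open Datasets row mentions the Rastrigin benchmark among the functions catalogued at the cited URL. Its standard closed form, f(x) = 10d + Σ_i (x_i² − 10 cos(2πx_i)) with global minimum f(0) = 0, is a common reference definition; the helper name below is our own.

```python
import numpy as np

def rastrigin(x):
    # Standard Rastrigin benchmark: highly multimodal, with many regularly
    # spaced local minima and a global minimum of 0 at the origin.
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))
```

Its dense grid of spurious local minima is what makes it a common stress test for the kind of saddle- and local-structure analysis the paper carries out.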