Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
High Probability Convergence of Stochastic Gradient Methods
Authors: Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy Nguyen
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we describe a generic approach to show convergence with high probability for both stochastic convex and non-convex optimization with sub-Gaussian noise. Instead, we show high probability convergence with bounds depending on the initial distance to the optimal solution. The method can be applied to the non-convex case. We demonstrate an $O((1+\sigma^{2}\log(1/\delta))/T+\sigma/\sqrt{T})$ convergence rate when the number of iterations $T$ is known and an $O((1+\sigma^{2}\log(T/\delta))/\sqrt{T})$ convergence rate when $T$ is unknown for SGD, where $1-\delta$ is the desired success probability. |
| Researcher Affiliation | Academia | (1) Stern School of Business, New York University; (2) Department of Computer Science, Boston University; (3) Khoury College of Computer Sciences, Northeastern University. Correspondence to: Zijian Liu <EMAIL>, Ta Duy Nguyen <EMAIL>, Thien Hang Nguyen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: Stochastic Mirror Descent Algorithm; Algorithm 2: Accelerated Stochastic Mirror Descent Algorithm (Lan, 2020); Algorithm 3: Stochastic Gradient Descent (SGD); Algorithm 4: AdaGrad-Norm |
| Open Source Code | No | The paper provides a link to the full version of the paper on arXiv (https://arxiv.org/abs/2302.14843), but this is not a link to open-source code for the methodology described in the paper. There is no explicit statement about releasing code. |
| Open Datasets | No | The paper is theoretical and does not use or refer to any datasets for empirical evaluation. Therefore, no information about publicly available or open datasets is provided. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments or dataset usage. Therefore, no information on training, validation, or test dataset splits is provided. |
| Hardware Specification | No | The paper is purely theoretical and does not involve empirical experiments, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for implementation or experimentation. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings. |
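
For readers unfamiliar with the methods named in the Pseudocode row, the following is a minimal Python sketch of plain SGD and of AdaGrad-Norm in its common textbook form (a single scalar step size `eta / b_t`, where `b_t` accumulates squared gradient norms). It is not the paper's pseudocode and does not reproduce any result, since the paper reports no experiments. The function names, the quadratic test objective, the Gaussian noise model, and all step-size values are illustrative assumptions.

```python
import math
import random


def sgd(grad, x0, step, num_iters):
    """Plain SGD with a constant step size: x <- x - step * g."""
    x = list(x0)
    for _ in range(num_iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def adagrad_norm(grad, x0, eta, b0, num_iters):
    """AdaGrad-Norm: one scalar step size eta / b_t shared by all coordinates."""
    x = list(x0)
    b_sq = b0 ** 2
    for _ in range(num_iters):
        g = grad(x)
        # Accumulate the squared norm of the current stochastic gradient.
        b_sq += sum(gi * gi for gi in g)
        scale = eta / math.sqrt(b_sq)
        x = [xi - scale * gi for xi, gi in zip(x, g)]
    return x


def make_noisy_quadratic_grad(sigma, rng):
    """Hypothetical oracle: gradient of f(x) = 0.5 * ||x||^2 plus Gaussian noise."""
    def grad(x):
        return [xi + rng.gauss(0.0, sigma) for xi in x]
    return grad


def norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(xi * xi for xi in x))
```

A possible usage, again with assumed values: build an oracle with `make_noisy_quadratic_grad(sigma=0.1, rng=random.Random(0))`, run `sgd(grad, [1.0, -2.0], step=0.1, num_iters=500)` and `adagrad_norm(grad, [1.0, -2.0], eta=1.0, b0=1.0, num_iters=500)`, and compare `norm(...)` of the two outputs. Unlike the constant-step SGD above, AdaGrad-Norm needs no step size tuned to the noise level or the horizon, which is the usual motivation for studying it.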