High Probability Convergence of Stochastic Gradient Methods

Authors: Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy Nguyen

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we describe a generic approach to show convergence with high probability for both stochastic convex and non-convex optimization with sub-Gaussian noise. In contrast to prior convex bounds that hold only in expectation or depend on the diameter of the domain, we show high probability convergence with bounds depending on the initial distance to the optimal solution. The method can also be applied to the non-convex case. For SGD, we demonstrate an O((1 + σ² log(1/δ))/T + σ/√T) convergence rate when the number of iterations T is known and an O((1 + σ² log(T/δ))/√T) convergence rate when T is unknown, where 1 − δ is the desired success probability. (A step-size sketch for the known versus unknown T cases follows the table.)
Researcher Affiliation | Academia | Stern School of Business, New York University; Department of Computer Science, Boston University; Khoury College of Computer Sciences, Northeastern University. Correspondence to: Zijian Liu <zl3067@nyu.edu>, Ta Duy Nguyen <taduy@bu.edu>, Thien Hang Nguyen <nguyen.thien@northeastern.edu>.
Pseudocode | Yes | Algorithm 1: Stochastic Mirror Descent; Algorithm 2: Accelerated Stochastic Mirror Descent (Lan, 2020); Algorithm 3: Stochastic Gradient Descent (SGD); Algorithm 4: AdaGrad-Norm. (A minimal AdaGrad-Norm sketch follows the table.)
Open Source Code | No | The paper provides a link to its full version on arXiv (https://arxiv.org/abs/2302.14843), but this is not a link to open-source code for the methods described in the paper, and there is no explicit statement about releasing code.
Open Datasets | No | The paper is theoretical and does not use or refer to any datasets for empirical evaluation. Therefore, no information about publicly available or open datasets is provided.
Dataset Splits | No | The paper is theoretical and does not describe empirical experiments or dataset usage. Therefore, no information on training, validation, or test dataset splits is provided.
Hardware Specification | No | The paper is purely theoretical and does not involve empirical experiments, thus no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for implementation or experimentation.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings.
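
The Research Type row quotes SGD rates that differ according to whether the iteration horizon T is known. Below is a minimal Python sketch of that distinction, assuming a constant step size proportional to 1/√T when T is known and a decreasing step size proportional to 1/√t otherwise; the step-size constants, the noisy-gradient oracle, and the iterate averaging are illustrative assumptions, not the paper's exact Algorithm 3.

```python
import numpy as np

def sgd_average(grad_oracle, x0, T=None, max_iters=1000, eta0=0.1):
    """Averaged SGD with two step-size regimes: constant ~1/sqrt(T) when the
    horizon T is known, decreasing ~1/sqrt(t) when it is not (illustrative)."""
    x = np.asarray(x0, dtype=float)
    n_steps = T if T is not None else max_iters
    avg = x.copy()
    for t in range(1, n_steps + 1):
        g = grad_oracle(x)                                   # noisy gradient at x
        eta = eta0 / np.sqrt(T) if T is not None else eta0 / np.sqrt(t)
        x = x - eta * g
        avg += (x - avg) / (t + 1)                           # running average of x_0, ..., x_t
    return avg

# Example: noisy gradients of f(x) = ||x||^2 / 2 with Gaussian noise.
rng = np.random.default_rng(0)
oracle = lambda x: x + 0.1 * rng.standard_normal(x.shape)
print(sgd_average(oracle, np.ones(5), T=10_000))
```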
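
The Pseudocode row also lists AdaGrad-Norm (Algorithm 4). Below is a minimal sketch of the standard AdaGrad-Norm update, which divides a base step size by the square root of the accumulated squared gradient norms; the constants and the oracle interface are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def adagrad_norm(grad_oracle, x0, eta=1.0, b0=1e-3, n_steps=1000):
    """AdaGrad-Norm with a scalar adaptive step size (illustrative constants)."""
    x = np.asarray(x0, dtype=float)
    acc = b0 ** 2                                # accumulator b_t^2, seeded with b_0^2
    for _ in range(n_steps):
        g = grad_oracle(x)                       # noisy gradient at x
        acc += float(np.dot(g, g))               # add ||g_t||^2
        x = x - (eta / np.sqrt(acc)) * g         # x_{t+1} = x_t - (eta / b_t) g_t
    return x

# Example: same noisy quadratic oracle as above.
rng = np.random.default_rng(1)
oracle = lambda x: x + 0.1 * rng.standard_normal(x.shape)
print(adagrad_norm(oracle, np.ones(5), n_steps=10_000))
```

Because the step size adapts to the observed gradient norms, AdaGrad-Norm does not need the noise level σ or the horizon T as inputs, which is why it is often analyzed alongside the fixed-schedule SGD above.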