Global Convergence and Stability of Stochastic Gradient Descent

Authors: Vivak Patel, Shushu Zhang, Bowen Tian

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Then, we develop novel theory to address this shortcoming in two ways. First, we establish that SGD's iterates will either globally converge to a stationary point or diverge under nearly arbitrary nonconvexity and noise models.
Researcher Affiliation | Academia | Vivak Patel, Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, vivak.patel@wisc.edu; Shushu Zhang, Department of Statistics, University of Michigan, Ann Arbor, shushuz@umich.edu; Bowen Tian, Department of Statistics, The Ohio State University, tian.837@buckeyemail.osu.edu
Pseudocode | No | The paper describes the SGD rule mathematically (θ_{k+1} = θ_k - M_k ∇f(θ_k, X_{k+1})) but does not provide it in a structured pseudocode or algorithm block (see the sketch below the table).
Open Source Code | No | The paper does not contain any statements or links indicating the provision of open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not involve empirical studies with data, so there are no datasets used in the context of training for which access information would be provided.
Dataset Splits | No | The paper is theoretical and does not involve empirical studies with data, so there are no dataset splits for validation described.
Hardware Specification | No | The paper is theoretical and does not report on experimental hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe experimental setup or software dependencies with specific version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or training configurations.
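
The SGD recursion referenced in the Pseudocode row is simple enough to sketch. Below is a minimal Python/NumPy illustration of θ_{k+1} = θ_k - M_k ∇f(θ_k, X_{k+1}); the quadratic per-sample loss, Gaussian sampling, and the 1/k diagonal step-size matrices are illustrative assumptions, not choices made in the paper, which works under far more general conditions on the objective, noise, and learning rates.

```python
import numpy as np

# Minimal sketch of the SGD recursion theta_{k+1} = theta_k - M_k * grad f(theta_k, X_{k+1}).
# The per-sample loss, the noise model, and the diagonal matrices M_k below are
# illustrative assumptions only.

rng = np.random.default_rng(0)

def stochastic_grad(theta, x):
    # Gradient of an assumed per-sample loss f(theta, x) = 0.5 * ||theta - x||^2.
    return theta - x

theta = rng.normal(size=3)          # theta_0: arbitrary starting point
for k in range(1, 1001):
    x = rng.normal(size=3)          # X_{k+1}: a fresh random sample
    M = (1.0 / k) * np.eye(3)       # M_k: diagonal (matrix-valued) learning rate
    theta = theta - M @ stochastic_grad(theta, x)

print(theta)                        # iterates settle near the minimizer of E[f] (here, 0)
```

In this toy setting the iterates converge to a stationary point; the paper's result is that, under nearly arbitrary nonconvexity and noise models, the iterates either globally converge to a stationary point or diverge.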