Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Gaussian Approximation and Concentration of Constant Learning-Rate Stochastic Gradient Descent
Authors: Ziyang Wei, Jiaqi Li, Zhipeng Lou, Wei Biao Wu
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our conditions are substantially milder than those required in the classical CLTs for SGD, yet offer a stronger convergence result. Furthermore, we derive the first Berry-Esseen bound for the Gaussian approximation error of constant learning-rate SGD, which is sharp compared to the decaying learning-rate schemes in the literature. Beyond the moment convergence, we also provide a Nagaev-type inequality for the SGD tail probabilities by adopting autoregressive approximation techniques, which entails non-asymptotic large-deviation guarantees. These results are verified via numerical simulations, paving the way for theoretically grounded uncertainty quantification, especially with non-asymptotic validity. |
| Researcher Affiliation | Academia | Ziyang Wei Department of Statistics University of Chicago Chicago, IL 60637 EMAIL Jiaqi Li Department of Statistics University of Chicago Chicago, IL 60637 EMAIL Zhipeng Lou Department of Mathematics University of California, San Diego La Jolla, CA 92093 EMAIL Wei Biao Wu Department of Statistics University of Chicago Chicago, IL 60637 EMAIL |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It describes mathematical methods and proves theorems. |
| Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide them in our supplementary materials. |
| Open Datasets | No | Consider the following data generating mechanism for the logistic regression model: $X_i = (a_i, b_i)$, $i = 1, 2, \ldots$, are i.i.d. random vectors, where the $a_i$ are generated from a 5-dimensional independent $t$ distribution with $df = 3$ degrees of freedom, and $b_i \in \{-1, 1\}$ follows a Bernoulli distribution with probability $P(b_i \mid a_i) = 1/(1 + \exp(-b_i a_i^\top \theta^{*}))$. |
| Dataset Splits | No | The paper describes a data generation mechanism for numerical simulations and states "We run 1000 independent trials with n = 500000 and γ = 0.005, 0.001, 0.0002." It does not refer to a pre-existing dataset with specific train/test/validation splits, as the data is generated synthetically per trial. |
| Hardware Specification | Yes | We conducted the experiments in R version 4.3.1 (2023-06-16) on a MacBook Air with a GPU Apple M1, 4 performance and 4 efficiency cores, and 8 GB LPDDR4 memory, equipped with macOS Big Sur version 11.5.1. |
| Software Dependencies | Yes | We conducted the experiments in R version 4.3.1 (2023-06-16)... |
| Experiment Setup | Yes | We run 1000 independent trials with n = 500000 and γ = 0.005, 0.001, 0.0002. Since the number of iterations is large enough, the main contributions to the tail probability bounds are the polynomial terms and the sub-Gaussian term, i.e., in the Markov-type bound, and in the Nagaev-type bound. In our simulation, we set C2 = 2, because Theorem 3.4 provides the equation that the asymptotic covariance Γ satisfies. |
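
The simulation described in the Open Datasets and Experiment Setup rows can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the paper reports using R, while this sketch uses standard-library Python. The value of the true parameter (`theta_true`), the zero initialization, the use of the logistic negative log-likelihood as the SGD loss, and all function names are assumptions made for illustration. The run length is also far shorter than the n = 500000 iterations and 1000 trials quoted above.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_t(rng, df=3):
    """Draw one Student-t variate as Z / sqrt(chi2_df / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / math.sqrt(chi2 / df)


def sample_pair(rng, theta_true, df=3):
    """Draw (a, b): a has independent t(df) coordinates, b is in {-1, +1}
    with P(b = +1 | a) = 1 / (1 + exp(-a^T theta_true))."""
    a = [sample_t(rng, df) for _ in theta_true]
    margin = sum(ai * ti for ai, ti in zip(a, theta_true))
    b = 1 if rng.random() < sigmoid(margin) else -1
    return a, b


def constant_lr_sgd(theta_true, gamma, n, seed=0):
    """Run n steps of SGD with constant learning rate gamma on the
    logistic loss log(1 + exp(-b a^T theta)), starting from zero
    (the starting point is an assumption of this sketch)."""
    rng = random.Random(seed)
    theta = [0.0] * len(theta_true)
    for _ in range(n):
        a, b = sample_pair(rng, theta_true)
        margin = b * sum(ai * ti for ai, ti in zip(a, theta))
        # Gradient of the loss is -b * sigmoid(-margin) * a,
        # so the SGD step adds gamma * b * sigmoid(-margin) * a.
        w = b * sigmoid(-margin)
        theta = [ti + gamma * w * ai for ti, ai in zip(theta, a)]
    return theta


if __name__ == "__main__":
    # Hypothetical true parameter; the source does not state its value here.
    theta_star = [1.0] * 5
    for gamma in (0.005, 0.001, 0.0002):
        last_iterate = constant_lr_sgd(theta_star, gamma, n=20000, seed=1)
        print(gamma, [round(t, 3) for t in last_iterate])
```

Repeating `constant_lr_sgd` over many seeds gives an empirical distribution of the iterates for each learning rate, which is the kind of quantity a Gaussian approximation or tail bound would be compared against.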