Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning Provably Improves the Convergence of Gradient Descent

Authors: Qingyu Song, Wei Lin, Hong Xu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically validate our theoretical findings through comprehensive experiments. The results showcase significant performance advantages, including up to a 50% improvement in solution optimality over the standard GD algorithm post-training, and superior robustness compared to SOTA L2O models and the Adam optimizer [10].
Researcher Affiliation	Academia	Qingyu Song (Xiamen University); Wei Lin, Hong Xu (The Chinese University of Hong Kong)
Pseudocode	No	The paper includes a computational graph (Figure 1) detailing the Math-L2O forward and backward operations, and various mathematical formulations for derivatives, but does not present a distinct pseudocode or algorithm block with structured steps.
Open Source Code	Yes	The code of our method can be found from https://github.com/NetX-lab/MathL2OProof-Official.
Open Datasets	Yes	Utilizing a compact Convolutional Neural Network (CNN) on the MNIST dataset, our method achieved significantly faster convergence, thereby corroborating our theoretical findings.
Dataset Splits	No	For the synthetic data, the paper states: "vectors $X \in \mathbb{R}^{5120 \times 1}$ and $Y \in \mathbb{R}^{4000 \times 1}$ for Equation (2) are generated by sampling from a standard Gaussian distribution." For MNIST: "The optimization objective is the total cross-entropy loss over 200 randomly selected MNIST samples." While samples are generated or selected, explicit train/validation/test splits are not provided.
Hardware Specification	Yes	Experiments are conducted using Python 3.9 and PyTorch 1.12.0 on an Ubuntu 20.04 system equipped with 128GB of RAM and two NVIDIA RTX 3090 GPUs.
Software Dependencies	Yes	Experiments are conducted using Python 3.9 and PyTorch 1.12.0 on an Ubuntu 20.04 system equipped with 128GB of RAM and two NVIDIA RTX 3090 GPUs.
Experiment Setup	Yes	The Math-L2O model is configured with T = 100 optimization steps (Equation (2)). Its architecture comprises a L = 3-layer DNN, as formulated in Equation (4). The first layer has an output dimension of 2. To ensure over-parameterization, the $(L-1)$-th (i.e., second) layer's output dimension is set to $512 \times 10 = 5120$. The final layer produces a scalar output (dimension 1). Three specific model configurations are designed for ablation studies, foundational experiments, and robustness evaluations. These are detailed in Appendix C.1. L2O models are trained using the Stochastic Gradient Descent (SGD) optimizer.
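
The figures quoted in the Dataset Splits and Experiment Setup rows can be collected into a small sketch. This is not the authors' code: the class name `MathL2OSetup`, its field names, the helper `sample_standard_gaussian`, and the fixed seed are all hypothetical, introduced only for illustration. It uses the Python standard library rather than PyTorch, and it covers only the stated layer output dimensions and the Gaussian sampling of X and Y, not the model or its training.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MathL2OSetup:
    """Values quoted in the Experiment Setup row; names are illustrative."""
    num_steps: int = 100        # T, optimization steps
    num_layers: int = 3         # L, layers in the DNN
    first_layer_out: int = 2    # output dimension of the first layer
    width_base: int = 512       # the two factors of the second-layer width
    width_multiplier: int = 10
    final_layer_out: int = 1    # scalar output

    def layer_output_dims(self) -> list:
        # Only the L = 3 case described in the row is covered here.
        hidden = self.width_base * self.width_multiplier  # 512 * 10 = 5120
        return [self.first_layer_out, hidden, self.final_layer_out]


def sample_standard_gaussian(n: int, seed: int = 0) -> list:
    """Draw n values from N(0, 1). The seed is an assumption, not from the paper."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


setup = MathL2OSetup()
dims = setup.layer_output_dims()
x = sample_standard_gaussian(5120)  # X in R^{5120 x 1}
y = sample_standard_gaussian(4000)  # Y in R^{4000 x 1}
print(dims, len(x), len(y))         # [2, 5120, 1] 5120 4000
```

The input dimension of the first layer is not stated in the rows above, so the sketch records output dimensions only.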