Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel

Authors: Yilan Chen, Zhichao Wang, Wei Huang, Andi Han, Taiji Suzuki, Arya Mazumdar

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical experiments on real-world datasets validate that our bounds correlate well with the true generalization gap. [...] We conduct comprehensive numerical experiments to demonstrate that our generalization bounds correlate well with the true generalization gap. For more simulations and details, see the Appendix.
Researcher Affiliation	Academia	1 University of California San Diego; 2 University of California Berkeley; 3 International Computer Science Institute; 4 RIKEN AIP; 5 The Institute of Statistical Mathematics; 6 The University of Sydney; 7 The University of Tokyo
Pseudocode	Yes	Algorithm 1 Gradient Flow. Require: N0, ρ, T0, T1, N, and λ. Initialize θ(0) ∼ Unif(S^{d−1}), c(0) ∼ Unif({c ∈ ℝ^N : ‖c‖_2 = ρ, ‖c‖_0 = N0}). Run gradient flow (6.3) up to time T = T0 + T1. Return θ(T), c(T).
Open Source Code	No	The paper does not contain any explicit statement about providing open-source code for the methodology, nor does it provide a direct link to a code repository.
Open Datasets	Yes	In Fig. 2, we use logistic loss to train a two-layer NN of 400 hidden nodes and Softplus activation function for binary classification on 4000 CIFAR-10 cat and dog [35] data by full-batch gradient descent and compute Γ, the main term in our bound. [...] In Fig. 1, we train a randomly initialized ResNet 18 by SGD on full CIFAR-10 [35] and estimate Γ in our bound. [...] For Fig. 6, we train a two-layer NN of 1000 hidden nodes with a learning rate of 0.1 and batch size 200 for 10 epochs. (II). Two-layer NN trained by SGD on full MNIST.
Dataset Splits	No	The paper mentions using "full CIFAR-10" and "full MNIST" and shows "test loss, and test error" in figures, implying the use of test sets. However, it does not explicitly provide the specific training/test/validation split percentages or methodology used for these datasets.
Hardware Specification	Yes	Experiments are implemented with PyTorch [56] on 24G A5000 and V100 GPUs.
Software Dependencies	No	Experiments are implemented with PyTorch [56]. The paper does not provide a specific version number for PyTorch or any other software dependency.
Experiment Setup	Yes	In experiment (I), we train the two-layer NN with a learning rate of η = 0.01 for 8000 steps. The training time is calculated by T = η × steps. [...] For experiment (II) and Fig. 4, we train ResNet 18 and ResNet 34 with a learning rate of 0.001 and batch size of 128 for 50 epochs. For Fig. 5, we train a two-layer NN of 1000 hidden nodes with a learning rate of 0.01 and batch size 128 for 100 epochs. For Fig. 6, we train a two-layer NN of 1000 hidden nodes with a learning rate of 0.1 and batch size 200 for 10 epochs.
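
The hyperparameters quoted in the Experiment Setup row can be collected in a small record, which also makes the training-time formula concrete. This is only an illustrative sketch: `RunConfig`, `training_time`, and `RUNS` are hypothetical names, the formula is read here as the product of learning rate and step count (an assumption about the quoted text), and any field the quote does not state is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunConfig:
    """One training run as quoted in the Experiment Setup row."""
    label: str
    model: str
    learning_rate: float
    batch_size: Optional[int] = None  # None: not stated in the quoted text
    steps: Optional[int] = None       # None: run length is given in epochs instead
    epochs: Optional[int] = None      # None: run length is given in steps instead


def training_time(learning_rate: float, steps: int) -> float:
    """Assumed reading of the quoted formula: T = learning rate * steps."""
    return learning_rate * steps


RUNS = [
    RunConfig("experiment (I)", "two-layer NN", 0.01, steps=8000),
    RunConfig("experiment (II) / Fig. 4", "ResNet 18 and ResNet 34", 0.001,
              batch_size=128, epochs=50),
    RunConfig("Fig. 5", "two-layer NN, 1000 hidden nodes", 0.01,
              batch_size=128, epochs=100),
    RunConfig("Fig. 6", "two-layer NN, 1000 hidden nodes", 0.1,
              batch_size=200, epochs=10),
]

if __name__ == "__main__":
    for run in RUNS:
        # Only runs whose length is given in steps have a directly computable T.
        if run.steps is not None:
            print(run.label, "T =", training_time(run.learning_rate, run.steps))
```

Under this reading, experiment (I) corresponds to a training time of roughly 0.01 × 8000 = 80. The epoch-based runs would need the dataset size to convert epochs into steps, which the quoted text does not supply.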
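
The initialization step of Algorithm 1 quoted in the Pseudocode row can be sketched as follows. Several things here are assumptions: θ(0) is read as uniform on the unit sphere, c(0) as uniform over vectors in R^N with exactly N0 non-zero entries and Euclidean norm ρ, and θ is treated as a single vector for simplicity. The flow equation (6.3) is not reproduced on this page, so `euler_gradient_flow` takes a hypothetical `grad_fn` placeholder and uses a plain forward-Euler discretization, which is a stand-in rather than the paper's method. All function names are illustrative.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a point uniformly from the unit sphere by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:  # guard against the (measure-zero) all-zero draw
            return [x / norm for x in v]


def sample_sparse_coefficients(n, n0, rho, rng):
    """Draw c in R^n with n0 non-zero entries on a random support and norm rho."""
    support = rng.sample(range(n), n0)        # uniformly random support of size n0
    direction = sample_unit_sphere(n0, rng)   # uniform direction on that support
    c = [0.0] * n
    for idx, u in zip(support, direction):
        c[idx] = rho * u
    return c


def euler_gradient_flow(params, grad_fn, total_time, step_size):
    """Forward-Euler approximation of d(params)/dt = -grad_fn(params)."""
    n_steps = int(round(total_time / step_size))
    for _ in range(n_steps):
        grad = grad_fn(params)
        params = [p - step_size * g for p, g in zip(params, grad)]
    return params


if __name__ == "__main__":
    rng = random.Random(0)
    theta0 = sample_unit_sphere(8, rng)
    c0 = sample_sparse_coefficients(n=20, n0=5, rho=1.5, rng=rng)
    # Placeholder objective 0.5 * ||theta||^2, whose gradient is theta itself.
    theta_T = euler_gradient_flow(theta0, lambda p: p, total_time=1.0, step_size=0.01)
    print(sum(1 for x in c0 if x != 0.0), math.sqrt(sum(x * x for x in theta_T)))
```

With step size η, running the Euler loop for T/η steps matches the relation between training time, learning rate, and step count quoted in the Experiment Setup row.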