Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Variational Learning Finds Flatter Solutions at the Edge of Stability

Authors: Avrajit Ghosh, Bai Cong, Rio Yokota, Saiprasad Ravishankar, Rongrong Wang, Molei Tao, Mohammad Emtiyaz Khan, Thomas Möllenhoff

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically validate these findings on a wide variety of large networks, such as ResNet and ViT, to find that the theoretical results closely match the empirical ones. ... Section 4: Experiments on Deep Neural Networks
Researcher Affiliation	Academia	1Michigan State University 2Institute of Science Tokyo 3RIKEN Center for AI Project 4Georgia Institute of Technology
Pseudocode	No	The paper refers to an algorithm from a cited work ('Shen et al. [2024] called IVON (Improved Variational Online Newton)...Algorithm 1, line 8') but does not contain a pseudocode or algorithm block within its own text.
Open Source Code	Yes	Code to replicate these results is available at https://github.com/Avra98/variationallearning-eos.
Open Datasets	Yes	We empirically validate these findings on a wide variety of large networks, such as ResNet and ViT, ... on CIFAR-10 for an MLP where VL achieves lower sharpness than GD... For example, on ImageNet, VL substantially improves overfitting... on the CIFAR-10 classification task... We train an MLP on CIFAR-10... SVHN dataset Netzer et al. [2011]... Fashion MNIST dataset... finetune the head on SST-2.
Dataset Splits	Yes	Across all three architectures on the CIFAR-10 classification task using MSE loss... train an MLP on CIFAR-10... SVHN dataset... Fashion MNIST dataset... MLP network is trained on a subset of the CIFAR-10 dataset containing 10000 images... finetune the head on SST-2.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	For three different learning rates, ρ = 0.05, 0.1, and 0.2, we run VGD with different σ². Across all learning rates ρ and numbers of posterior samples N_s, we observe that larger variance consistently leads to lower sharpness. ... For both GD and Variational GD, we use the same learning rate of 0.05.