Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Global curvature for second-order optimization of neural networks

Authors: Alberto Bernacchia

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate the practical implications of our framework, we apply second-order optimization to synthetic data, achieving markedly faster convergence compared to traditional optimization methods.
Researcher Affiliation	Industry	MediaTek Research, Cambridge, UK. Correspondence to: Alberto Bernacchia <EMAIL>.
Pseudocode	Yes	A detailed description of the complete procedure is provided in Algorithm 1 in the Appendix, using the simple case of a two-layer MLP with Tanh activation and no bias.
Open Source Code	Yes	Code: github.com/mtkresearch/symo notebooks
Open Datasets	No	The synthetic dataset consists of 5000 training and 5000 testing data points, where the input is sampled from a Gaussian distribution with zero mean. The covariance matrix of the input is generated using random orthogonal eigenvectors (Mezzadri, 2007), and the eigenvalues are set on a logarithmic grid between 10^-5 and 10^0.
Dataset Splits	Yes	The synthetic dataset consists of 5000 training and 5000 testing data points, where the input is sampled from a Gaussian distribution with zero mean.
Hardware Specification	No	These are matrix-matrix products of size equal to the neural network width, that can be computed efficiently using a GPU.
Software Dependencies	No	In Pytorch for example, Assumption 2.1 holds for nn.init.normal and nn.init.orthogonal...
Experiment Setup	Yes	For all optimizers, learning rate is set by a grid search. For second-order optimizers, we additionally set a second hyperparameter by grid search: damping λ for KFAC, initialization ϵ for Shampoo and decay parameter β for SymO.
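
Because the synthetic data is described but not released, the input distribution quoted in the Open Datasets row can only be approximated from that description. The following is a minimal sketch, not the author's code. It assumes NumPy. The function names, the input dimension `dim`, and the seed are hypothetical. The default eigenvalue range (1e-5 to 1) is an assumed reading of the quoted range and can be changed through `eig_min` and `eig_max`. The quoted text does not say how targets are generated, so the sketch produces inputs only.

```python
import numpy as np


def random_orthogonal(dim, rng):
    """Random orthogonal matrix via QR of a Gaussian matrix (Mezzadri, 2007)."""
    gauss = rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(gauss)
    # QR is unique only up to column signs; fixing the signs of diag(R)
    # makes the resulting matrix Haar-distributed.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs  # scales column j of q by signs[j]


def make_inputs(n_train=5000, n_test=5000, dim=20,
                eig_min=1e-5, eig_max=1.0, seed=0):
    """Zero-mean Gaussian inputs with covariance V diag(eigvals) V^T.

    dim, eig_min, eig_max and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    eigvecs = random_orthogonal(dim, rng)
    # Eigenvalues on a logarithmic grid between eig_min and eig_max.
    eigvals = np.logspace(np.log10(eig_min), np.log10(eig_max), dim)
    cov = (eigvecs * eigvals) @ eigvecs.T
    # If z ~ N(0, I), then x = V sqrt(Lambda) z has covariance V Lambda V^T.
    z = rng.standard_normal((n_train + n_test, dim))
    x = (z * np.sqrt(eigvals)) @ eigvecs.T
    return x[:n_train], x[n_train:], cov


x_train, x_test, cov = make_inputs()
print(x_train.shape, x_test.shape)
```

Sampling through the eigendecomposition rather than a Cholesky factor keeps the sketch numerically stable when the covariance is badly conditioned, which is the regime the quoted eigenvalue grid suggests.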