Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Authors: Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang, Zhiwei Bai
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Figure 3, we conduct experiments to assess the practical significance of the previously estimated optimistic sample sizes in relation to the actual fitting performance of two-layer tanh neural networks (NNs) with varying architectures. ... We verify this experimentally in Figure 4, where we compare the fitting performances of neural networks (NNs) trained with and without dropout. |
| Researcher Affiliation | Academia | Yaoyu Zhang (a, b), Leyang Zhang (c), Zhongwang Zhang (a), Zhiwei Bai (a). (a) School of Mathematical Sciences, Institute of Natural Sciences and MOE-LSC, Shanghai Jiao Tong University, Shanghai, 200240, China; (b) School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, 200240, China; (c) School of Mathematics, Georgia Institute of Technology, Atlanta, GA, 30332, United States |
| Pseudocode | No | The paper describes methods and proofs but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | We generate both training and test data sets by sampling input data from a standard normal distribution and computing the output using the target function. ... Both training and test data sets were generated by sampling input data from an equally spaced distribution in the interval [-15, 14] and computing the corresponding outputs using the target function. |
| Dataset Splits | Yes | The size of the training data set varies whereas the size of the test data set is fixed to 1000. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | Figure 4 caption mentions 'Pytorch default initialization' which implies PyTorch was used, but no specific version number is provided for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For all experiments, network parameters are initialized by a normal distribution with mean 0 and variance 10^-20, and trained by full-batch gradient descent with a fine-tuned learning rate. ... The learning rate for the experiments in each setup is fine-tuned from 0.05 to 0.5 for a better generalization performance. ... For all experiments, network parameters are initialized by Pytorch default initialization, and trained by full-batch Adam optimizer with a fine-tuned learning rate. ... In the dropout scenario, neurons were randomly deactivated with a probability of 10% during training. ... The learning rate for the experiments in each setup is fine-tuned from 10^-3 to 10^-4 for a better generalization performance. |
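For orientation, the quoted setup (two-layer tanh network, tiny normal initialization, full-batch gradient descent on normally sampled inputs) can be sketched as follows. This is a minimal NumPy stand-in, not the authors' code: the target function, network width, sample sizes, and step count are placeholders, and only the initialization variance (10^-20), the learning-rate range, and the test-set size of 1000 come from the paper's quoted text.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Placeholder target function; the paper's exact targets are not quoted here.
    return np.sin(x)

n_train, n_test, width = 50, 1000, 100  # hypothetical sizes; test size 1000 is from the paper
x_train = rng.standard_normal((n_train, 1))  # inputs from a standard normal, as quoted
x_test = rng.standard_normal((n_test, 1))
y_train, y_test = target(x_train), target(x_test)

# Initialization: normal with mean 0 and variance 10^-20, as reported.
std = np.sqrt(1e-20)
W1 = rng.normal(0.0, std, (1, width))
b1 = rng.normal(0.0, std, width)
W2 = rng.normal(0.0, std, (width, 1))

def forward(x):
    return np.tanh(x @ W1 + b1) @ W2

lr = 0.5  # reported range: fine-tuned in [0.05, 0.5]
for _ in range(1000):  # step count is a placeholder
    h = np.tanh(x_train @ W1 + b1)      # hidden activations
    err = h @ W2 - y_train              # full-batch residual
    gW2 = h.T @ err / n_train           # gradient w.r.t. output weights
    dh = (err @ W2.T) * (1 - h**2)      # backprop through tanh
    gW1 = x_train.T @ dh / n_train
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2                      # full-batch gradient descent step
    W1 -= lr * gW1
    b1 -= lr * gb1

test_mse = float(np.mean((forward(x_test) - y_test) ** 2))
```

The second quoted setup (PyTorch default initialization, full-batch Adam, 10% dropout) would replace the plain gradient step with an Adam update and randomly mask hidden units during training; it is omitted here to keep the sketch self-contained.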