Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization

Authors: Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang, Zhiwei Bai

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In Figure 3, we conduct experiments to assess the practical significance of the previously estimated optimistic sample sizes in relation to the actual fitting performance of two-layer tanh neural networks (NNs) with varying architectures. ... We verify this experimentally in Figure 4, where we compare the fitting performances of neural networks (NNs) trained with and without dropout."
Researcher Affiliation | Academia | Yaoyu Zhang (a,b), Leyang Zhang (c), Zhongwang Zhang (a), Zhiwei Bai (a). (a) School of Mathematical Sciences, Institute of Natural Sciences and MOE-LSC, Shanghai Jiao Tong University, Shanghai 200240, China; (b) School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai 200240, China; (c) School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, United States.
Pseudocode | No | The paper describes its methods and proofs but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper contains no explicit statement about releasing source code and no link to a code repository for the described methodology.
Open Datasets | No | "We generate both training and test data sets by sampling input data from a standard normal distribution and computing the output using the target function. ... Both training and test data sets were generated by sampling input data from an equally spaced distribution in the interval [-15, 14] and computing the corresponding outputs using the target function."
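The two quoted data-generation schemes can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the target function `f`, the input dimension, and the training-set size are placeholders (the excerpt fixes only the test-set size at 1000).

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin  # hypothetical stand-in for the paper's target function

n_train, n_test, d = 500, 1000, 2  # n_train and d are assumed; n_test = 1000 per the quote

# Scheme 1: inputs sampled from a standard normal distribution,
# outputs computed by applying the target function.
X_train = rng.standard_normal((n_train, d))
X_test = rng.standard_normal((n_test, d))
y_train = f(X_train).sum(axis=1)
y_test = f(X_test).sum(axis=1)

# Scheme 2: inputs equally spaced on an interval
# (the quote reads "[-15, 14]"), outputs from the same target function.
X_grid = np.linspace(-15, 14, n_train).reshape(-1, 1)
y_grid = f(X_grid).ravel()
```

The training-set size would be varied across runs while the test set stays fixed, matching the "Dataset Splits" row below.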
Dataset Splits | Yes | "The size of the training data set varies whereas the size of the test data set is fixed to 1000."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The caption of Figure 4 mentions "Pytorch default initialization", which implies PyTorch was used, but no version number is given for PyTorch or any other software dependency.
Experiment Setup | Yes | "For all experiments, network parameters are initialized by a normal distribution with mean 0 and variance 10^-20, and trained by full-batch gradient descent with a fine-tuned learning rate. ... The learning rate for the experiments in each setup is fine-tuned from 0.05 to 0.5 for a better generalization performance. ... For all experiments, network parameters are initialized by Pytorch default initialization, and trained by full-batch Adam optimizer with a fine-tuned learning rate. ... In the dropout scenario, neurons were randomly deactivated with a probability of 10% during training. ... The learning rate for the experiments in each setup is fine-tuned from 10^-3 to 10^-4 for a better generalization performance."
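The first quoted setup (two-layer tanh network, zero-mean normal initialization, full-batch gradient descent on a fine-tuned learning rate) can be sketched in plain numpy. This is not the authors' code: the width, sample size, and target function are assumptions, and the demo uses a larger initialization variance (1e-2) than the paper's 10^-20 so that a short run visibly reduces the loss; with the paper's tiny variance the early dynamics are extremely slow.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, m = 200, 1, 50                      # samples, input dim, hidden width (assumed)
X = rng.standard_normal((n, d))
y = np.sin(X)                             # hypothetical target function

# Zero-mean normal initialization; variance 1e-2 here for a quick demo,
# whereas the quoted setup uses variance 10^-20.
scale = np.sqrt(1e-2)
W1 = rng.standard_normal((d, m)) * scale
b1 = np.zeros(m)
W2 = rng.standard_normal((m, 1)) * scale

lr = 0.05                                 # low end of the quoted 0.05-0.5 range
losses = []
for _ in range(1000):
    H = np.tanh(X @ W1 + b1)              # hidden activations
    pred = H @ W2                         # network output
    err = pred - y
    losses.append(float(np.mean(err ** 2)))

    # Full-batch gradients of the mean-squared error
    g_pred = 2 * err / n
    gW2 = H.T @ g_pred
    gH = g_pred @ W2.T
    gZ = gH * (1 - H ** 2)                # tanh'(z) = 1 - tanh(z)^2
    gW1 = X.T @ gZ
    gb1 = gZ.sum(axis=0)

    # Plain full-batch gradient descent step
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
```

The second quoted setup would swap the update step for Adam, use PyTorch's default initialization, and randomly zero hidden units with probability 0.1 during training for the dropout comparison.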