Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Authors: Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang, Zhiwei Bai
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Figure 3, we conduct experiments to assess the practical significance of the previously estimated optimistic sample sizes in relation to the actual fitting performance of two-layer tanh neural networks (NNs) with varying architectures. ... We verify this experimentally in Figure 4, where we compare the fitting performances of neural networks (NNs) trained with and without dropout. |
| Researcher Affiliation | Academia | Yaoyu Zhang (a, b), Leyang Zhang (c), Zhongwang Zhang (a), Zhiwei Bai (a). (a) School of Mathematical Sciences, Institute of Natural Sciences and MOE-LSC, Shanghai Jiao Tong University, Shanghai, 200240, China; (b) School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, 200240, China; (c) School of Mathematics, Georgia Institute of Technology, Atlanta, GA, 30332, United States |
| Pseudocode | No | The paper describes methods and proofs but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | We generate both training and test data sets by sampling input data from a standard normal distribution and computing the output using the target function. ... Both training and test data sets were generated by sampling input data from an equally spaced distribution in the interval [-15, 14] and computing the corresponding outputs using the target function. |
| Dataset Splits | Yes | The size of the training data set varies whereas the size of the test data set is fixed to 1000. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | Figure 4 caption mentions 'Pytorch default initialization' which implies PyTorch was used, but no specific version number is provided for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For all experiments, network parameters are initialized by a normal distribution with mean 0 and variance 10^-20, and trained by full-batch gradient descent with a fine-tuned learning rate. ... The learning rate for the experiments in each setup is fine-tuned from 0.05 to 0.5 for a better generalization performance. ... For all experiments, network parameters are initialized by Pytorch default initialization, and trained by full-batch Adam optimizer with a fine-tuned learning rate. ... In the dropout scenario, neurons were randomly deactivated with a probability of 10% during training. ... The learning rate for the experiments in each setup is fine-tuned from 10^-3 to 10^-4 for a better generalization performance. |
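For orientation, the quoted setup (two-layer tanh network, tiny normal initialization, full-batch gradient descent on normally sampled inputs) can be sketched as follows. This is a minimal NumPy stand-in, not the authors' code: the target function, network width, sample sizes, and step count are placeholders, and only the initialization variance (10^-20), the learning-rate range, and the test-set size of 1000 come from the paper's quoted text.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Placeholder target function; the paper's exact targets are not quoted here.
    return np.sin(x)

n_train, n_test, width = 50, 1000, 100  # hypothetical sizes; test size 1000 is from the paper
x_train = rng.standard_normal((n_train, 1))  # inputs from a standard normal, as quoted
x_test = rng.standard_normal((n_test, 1))
y_train, y_test = target(x_train), target(x_test)

# Initialization: normal with mean 0 and variance 10^-20, as reported.
std = np.sqrt(1e-20)
W1 = rng.normal(0.0, std, (1, width))
b1 = rng.normal(0.0, std, width)
W2 = rng.normal(0.0, std, (width, 1))

def forward(x):
    return np.tanh(x @ W1 + b1) @ W2

lr = 0.5  # reported range: fine-tuned in [0.05, 0.5]
for _ in range(1000):  # step count is a placeholder
    h = np.tanh(x_train @ W1 + b1)      # hidden activations
    err = h @ W2 - y_train              # full-batch residual
    gW2 = h.T @ err / n_train           # gradient w.r.t. output weights
    dh = (err @ W2.T) * (1 - h**2)      # backprop through tanh
    gW1 = x_train.T @ dh / n_train
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2                      # full-batch gradient descent step
    W1 -= lr * gW1
    b1 -= lr * gb1

test_mse = float(np.mean((forward(x_test) - y_test) ** 2))
```

The second quoted setup (PyTorch default initialization, full-batch Adam, 10% dropout) would replace the plain gradient step with an Adam update and randomly mask hidden units during training; it is omitted here to keep the sketch self-contained.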