Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent
Authors: Liu Ziyin, Mingze Wang, Hongchao Li, Lei Wu
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our theory to understand specific problems and present numerical results in Section 5. All the proofs are presented in the Appendix. |
| Researcher Affiliation | Collaboration | Liu Ziyin (Massachusetts Institute of Technology, NTT Research); Mingze Wang (Peking University); Hongchao Li (The University of Tokyo); Lei Wu (Peking University) |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: The code or data of the experiments are simple and easy to reproduce following the description in the main text. |
| Open Datasets | Yes | Here, we give the details for the experiment in Figure 2. We train a two-layer linear net with d0 = d2 = 30 and d = 40. The input data is x ∼ N(0, 1), and y = x + ϵ, where ϵ is i.i.d. Gaussian with unit variance. |
| Dataset Splits | No | The paper mentions training and testing phases but does not explicitly provide details about training/validation/test dataset splits, such as percentages or sample counts for a validation set. |
| Hardware Specification | No | For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [No] Justification: The experiments can be simply conducted on personal computers. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | "Here, we give the details for the experiment in Figure 2. We train a two-layer linear net with d0 = d2 = 30 and d = 40. The input data is x ∼ N(0, 1), and y = x + ϵ, where ϵ is i.i.d. Gaussian with unit variance." (Section A.2); "When the learning rate (η = 0.008) is too large, SGD diverges (orange line). However, when one starts training at a small learning rate (0.001) and increases η to 0.008 after 5000 iterations, the training remains stable." (Figure 4 caption); "Unless it is the independent variable, η, S and d are set to be 0.1, 100 and 2000, respectively." (Figure 8 caption) |
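
Because the paper releases no code, the setup quoted in the Experiment Setup row can only be approximated. The following is a minimal NumPy sketch, not the authors' implementation, of the Figure 2 configuration (two-layer linear net, d0 = d2 = 30, d = 40, x ∼ N(0, 1), y = x + ϵ) together with the two-stage learning rate from the Figure 4 caption. The function names, the squared-error loss, the batch size, the initialization scale, the step count, and the random seed are all assumptions made for illustration; the excerpts above do not specify them. Whether Figure 4 uses the same network as Figure 2 is also not stated here, so passing the schedule to this model is an assumption as well.

```python
import numpy as np


def lr_schedule(step, eta_small=0.001, eta_large=0.008, switch_step=5000):
    """Two-stage learning rate quoted from the Figure 4 caption:
    0.001 for the first 5000 iterations, then 0.008."""
    return eta_small if step < switch_step else eta_large


def train_two_layer_linear(steps=200, batch_size=32, lr_fn=None, seed=0,
                           d_in=30, d_hidden=40, d_out=30):
    """Minibatch SGD on a two-layer linear net, prediction = W2 @ W1 @ x.

    Dimensions follow the quoted Section A.2 (d0 = d2 = 30, d = 40).
    batch_size, steps, seed, the loss, and the initialization scale
    are assumptions; the quoted text does not give them.
    """
    if lr_fn is None:
        lr_fn = lambda step: 0.001  # assumed constant default rate

    rng = np.random.default_rng(seed)
    # Assumed initialization: Gaussian weights scaled by 1/sqrt(fan_in).
    W1 = rng.normal(scale=1.0 / np.sqrt(d_in), size=(d_hidden, d_in))
    W2 = rng.normal(scale=1.0 / np.sqrt(d_hidden), size=(d_out, d_hidden))

    losses = []
    for step in range(steps):
        # Fresh synthetic batch: x ~ N(0, I), y = x + unit-variance noise.
        x = rng.normal(size=(batch_size, d_in))
        y = x + rng.normal(size=(batch_size, d_out))

        h = x @ W1.T           # hidden activations, shape (batch, d_hidden)
        err = h @ W2.T - y     # residuals, shape (batch, d_out)
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))

        # Gradients of the mean squared-error loss (assumed loss).
        grad_W2 = err.T @ h / batch_size
        grad_W1 = (err @ W2).T @ x / batch_size

        eta = lr_fn(step)
        W1 -= eta * grad_W1
        W2 -= eta * grad_W2

    return W1, W2, losses
```

Calling `train_two_layer_linear(steps=10000, lr_fn=lr_schedule)` would apply the two-stage rate, while `lr_fn=lambda step: 0.008` would use the larger rate from the start. The sketch makes no claim about reproducing the divergence or stability reported in the paper, since the unspecified details above may matter.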