Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Nesterov acceleration despite very noisy gradients
Authors: Kanan Gupta, Jonathan W. Siegel, Stephan Wojtowytsch
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Numerical Experiments |
| Researcher Affiliation | Academia | Kanan Gupta, Department of Mathematics, University of Pittsburgh; Jonathan W. Siegel, Department of Mathematics, Texas A&M University; Stephan Wojtowytsch, Department of Mathematics, University of Pittsburgh |
| Pseudocode | Yes | Algorithm 1: Accelerated Gradient descent with Noisy EStimators (AGNES) |
| Open Source Code | Yes | All the code used for the experiments in the paper has been provided in the supplementary materials. |
| Open Datasets | Yes | We trained ResNet-34 [He et al., 2016]... on the CIFAR-10 image dataset [Krizhevsky et al., 2009]... We tried various combinations of AGNES hyperparameters α and η to train LeNet-5 on the MNIST dataset |
| Dataset Splits | No | The resulting dataset was split into 90% training and 10% testing data. The paper specifies training and testing splits but does not mention a separate validation split. |
| Hardware Specification | No | The experiments in sections 5.3 and 5.4 were run on a single current generation GPU in a local cluster for up to 50 hours. This work used the H2P cluster, which is supported by NSF award number OAC-2117681. While it mentions "single current generation GPU" and "H2P cluster", it does not specify exact GPU/CPU models or detailed specifications. |
| Software Dependencies | No | All neural-network based experiments were performed using the PyTorch library. The paper mentions PyTorch but does not specify a version number. |
| Experiment Setup | Yes | We selected the learning rate 10^-3 for Adam... For AGNES, NAG, and SGD, based on initial exploratory experiments, we used a learning rate of 10^-4, a momentum value of 0.99, and for AGNES, a correction step size η = 10^-3. We used the same initial learning rate 10^-3 for all the algorithms, which was dropped to 10^-4 after 25 epochs. A momentum value of 0.99 was used for SGD, NAG, and AGNES and a constant correction step size η = 10^-2 was used for AGNES. |
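
The hyperparameters quoted in the Experiment Setup row could be recorded as a small configuration, sketched below in plain Python. This is an illustration only, not the authors' code: the labels `setup_a` and `setup_b`, the key names, and the helper `step_lr` are hypothetical; the learning rates are read as negative powers of ten; and the 0-indexed epoch convention is an assumption, since the quoted text only says the rate was dropped "after 25 epochs".

```python
# Hyperparameters as quoted in the "Experiment Setup" row.
# "setup_a" / "setup_b" are hypothetical labels: the page does not say
# which experiment each quoted passage belongs to.
SETUPS = {
    "setup_a": {
        "Adam": {"lr": 1e-3},
        "SGD": {"lr": 1e-4, "momentum": 0.99},
        "NAG": {"lr": 1e-4, "momentum": 0.99},
        # "correction_step" stands for the AGNES correction step size (eta).
        "AGNES": {"lr": 1e-4, "momentum": 0.99, "correction_step": 1e-3},
    },
    "setup_b": {
        "initial_lr": 1e-3,          # same for all algorithms
        "dropped_lr": 1e-4,          # used after the drop
        "drop_after_epochs": 25,
        "momentum": {"SGD": 0.99, "NAG": 0.99, "AGNES": 0.99},
        "agnes_correction_step": 1e-2,  # held constant
    },
}


def step_lr(epoch, initial_lr=1e-3, dropped_lr=1e-4, drop_after_epochs=25):
    """Piecewise-constant learning rate for setup_b.

    Assumes 0-indexed epochs, so epochs 0..24 use initial_lr and
    epoch 25 onward uses dropped_lr.
    """
    return initial_lr if epoch < drop_after_epochs else dropped_lr


if __name__ == "__main__":
    for epoch in (0, 24, 25, 40):
        print(epoch, step_lr(epoch))
```

Keeping the two quoted settings in separate entries avoids mixing the constant-rate configuration with the one that drops the rate partway through training.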