Nesterov acceleration despite very noisy gradients
Authors: Kanan Gupta, Jonathan W. Siegel, Stephan Wojtowytsch
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5: Numerical Experiments |
| Researcher Affiliation | Academia | Kanan Gupta, Department of Mathematics, University of Pittsburgh, kanan.g@pitt.edu; Jonathan W. Siegel, Department of Mathematics, Texas A&M University, jwsiegel@tamu.edu; Stephan Wojtowytsch, Department of Mathematics, University of Pittsburgh, s.woj@pitt.edu |
| Pseudocode | Yes | Algorithm 1: Accelerated Gradient descent with Noisy EStimators (AGNES) |
| Open Source Code | Yes | All the code used for the experiments in the paper has been provided in the supplementary materials. |
| Open Datasets | Yes | We trained ResNet-34 [He et al., 2016]... on the CIFAR-10 image dataset [Krizhevsky et al., 2009]... We tried various combinations of AGNES hyperparameters α and η to train LeNet-5 on the MNIST dataset |
| Dataset Splits | No | The resulting dataset was split into 90% training and 10% testing data. The paper specifies training and testing splits but does not mention a separate validation split. |
| Hardware Specification | No | The experiments in sections 5.3 and 5.4 were run on a single current generation GPU in a local cluster for up to 50 hours. This work used the H2P cluster, which is supported by NSF award number OAC-2117681. While it mentions "single current generation GPU" and "H2P cluster", it does not specify exact GPU/CPU models or detailed specifications. |
| Software Dependencies | No | All neural-network based experiments were performed using the PyTorch library. The paper mentions PyTorch but does not specify a version number. |
| Experiment Setup | Yes | We selected the learning rate 10⁻³ for Adam... For AGNES, NAG, and SGD, based on initial exploratory experiments, we used a learning rate of 10⁻⁴, a momentum value of 0.99, and for AGNES, a correction step size η = 10⁻³. We used the same initial learning rate 10⁻³ for all the algorithms, which was dropped to 10⁻⁴ after 25 epochs. A momentum value of 0.99 was used for SGD, NAG, and AGNES and a constant correction step size η = 10⁻² was used for AGNES. |
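The Pseudocode row refers to Algorithm 1 (AGNES) in the paper, which is not reproduced on this page. The snippet below is only a hedged sketch of a two-step-size Nesterov-style update with a gradient step size α, a correction step size η, and momentum ρ, using the default values quoted in the Experiment Setup row (α = 10⁻⁴, η = 10⁻³, ρ = 0.99). The exact placement of ρ and η in the velocity update is an assumption and should be checked against Algorithm 1; the sketch reduces to standard NAG when η = α.

```python
# Hedged sketch of a two-step-size Nesterov-style update. NOT verified against
# Algorithm 1 of the paper: the update rule below is an assumption. alpha is
# the gradient step size, eta the correction step size, rho the momentum.
# With eta == alpha this is exactly Nesterov's accelerated gradient.
import numpy as np

def agnes_style_step(x, v, grad_fn, alpha=1e-4, eta=1e-3, rho=0.99):
    """One update of the iterate x and velocity v.

    grad_fn returns a (possibly noisy) gradient estimate at the lookahead
    point x + v.
    """
    g = grad_fn(x + v)           # stochastic gradient at the lookahead point
    x_new = x + v - alpha * g    # gradient step taken from the lookahead point
    v_new = rho * (v - eta * g)  # velocity update with correction step eta
    return x_new, v_new

# Toy usage on a quadratic with additive gradient noise.
rng = np.random.default_rng(0)
grad = lambda z: 2.0 * z + 0.1 * rng.standard_normal(z.shape)
x, v = np.ones(3), np.zeros(3)
for _ in range(1000):
    x, v = agnes_style_step(x, v, grad)
```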
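For the Experiment Setup row, the following is a minimal sketch (not the authors' released code, which is in their supplementary materials) of the NAG baseline it describes: ResNet-34 on CIFAR-10 with Nesterov-momentum SGD, momentum 0.99, and an initial learning rate of 10⁻³ dropped to 10⁻⁴ after 25 epochs. The batch size, epoch budget, data transforms, and the assignment of this schedule to the CIFAR-10 run are assumptions; the AGNES correction step η belongs to Algorithm 1 and is not implemented here.

```python
import torch
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# Plain ToTensor transform; the paper's preprocessing/augmentation is not specified here.
transform = T.Compose([T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,  # batch size is an assumption
                                           shuffle=True)

model = torchvision.models.resnet34(num_classes=10).to(device)
criterion = torch.nn.CrossEntropyLoss()

# NAG baseline with the hyperparameters quoted in the Experiment Setup row.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.99, nesterov=True)
# Drop the learning rate from 1e-3 to 1e-4 after 25 epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[25],
                                                 gamma=0.1)

for epoch in range(50):  # epoch budget is an assumption
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```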