Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Nesterov Accelerated Shuffling Gradient Method for Convex Optimization
Authors: Trang H Tran, Katya Scheinberg, Lam M Nguyen
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical simulations demonstrate the efficiency of our algorithm. |
| Researcher Affiliation | Collaboration | 1School of Operations Research and Information Engineering, Cornell University, Ithaca, NY, USA. 2IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, USA. |
| Pseudocode | Yes | Algorithm 2 Nesterov Accelerated Shuffling Gradient (NASG) Method |
| Open Source Code | Yes | Our code can be found at the repository https://github.com/htt-trangtran/nasg. |
| Open Datasets | Yes | We have conducted the experiments on three classification datasets w8a (49,749 samples), ijcnn1 (91,701 samples) and covtype (406709 samples) from LIBSVM (Chang & Lin, 2011). ... We test our algorithm using linear neural networks on three well-known image classification datasets: MNIST dataset (LeCun et al., 1998) and Fashion-MNIST dataset (Xiao et al., 2017) both with 60,000 samples, and finally CIFAR-10 dataset (Krizhevsky & Hinton, 2009) with 50,000 images. |
| Dataset Splits | Yes | At the tuning stage, we test each method for 20 epochs. We run every algorithm with a constant learning rate where the learning rates follows a grid search and select the ones that perform best according to their results. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, or memory) used to run its experiments. |
| Software Dependencies | Yes | All the algorithms are implemented in Python using PyTorch package (Paszke et al., 2019). |
| Experiment Setup | Yes | The minibatch size is 256. ... We tune each algorithm using constant learning rate and report the best final results. ... For SGD and NASG the searching grid is {1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001}. ... For SGD-M, ... Note that this momentum update is implemented in PyTorch with the default value β = 0.9. ... For Adam, we fixed two hyper-parameters β1 := 0.9, β2 := 0.999 as in the original paper. Since the default learning rate for Adam is 0.001, we let our searching grid be {0.005, 0.001, 0.0005}. |
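
The tuning protocol quoted in the Dataset Splits and Experiment Setup rows can be written down as a small configuration plus a grid search. The following is a minimal sketch, not the authors' code: the names `SEARCH_GRIDS`, `select_learning_rate`, and `toy_final_loss` are hypothetical, the selection criterion (lowest final loss) is an assumption because the excerpt only says the best-performing rates are selected, and the SGD-M grid is omitted because it is elided in the quoted text.

```python
from typing import Callable, Dict, List, Tuple

# Values quoted in the table above.
BATCH_SIZE = 256            # minibatch size
TUNING_EPOCHS = 20          # epochs per method at the tuning stage
SGDM_MOMENTUM = 0.9         # momentum value quoted for SGD-M
ADAM_BETAS = (0.9, 0.999)   # fixed Adam hyper-parameters (beta1, beta2)

# Constant learning-rate grids quoted in the Experiment Setup row.
# The SGD-M grid is elided in the excerpt, so it is left out rather than guessed.
SEARCH_GRIDS: Dict[str, List[float]] = {
    "SGD": [1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001],
    "NASG": [1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001],
    "Adam": [0.005, 0.001, 0.0005],
}


def select_learning_rate(
    method: str,
    final_loss: Callable[[str, float], float],
) -> Tuple[float, float]:
    """Return (best_lr, best_loss) over the method's grid.

    `final_loss` is a hypothetical callback standing in for a full
    training run with a constant learning rate; lower is assumed better.
    """
    results = [(final_loss(method, lr), lr) for lr in SEARCH_GRIDS[method]]
    best_loss, best_lr = min(results)
    return best_lr, best_loss


def toy_final_loss(method: str, lr: float) -> float:
    # Stand-in objective for illustration only: minimized at lr = 0.01.
    return (lr - 0.01) ** 2


if __name__ == "__main__":
    for method in SEARCH_GRIDS:
        best_lr, best_loss = select_learning_rate(method, toy_final_loss)
        print(method, best_lr, best_loss)
```

In a real run, `toy_final_loss` would be replaced by a function that trains the model for `TUNING_EPOCHS` epochs with the given constant learning rate and returns the metric used for selection.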