Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Authors: Scott Pesme, Aymeric Dieuleveut, Nicolas Flammarion
ICML 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our theoretical results with synthetic and real examples. We provide additional experiments in Appendix A.2. |
| Researcher Affiliation | Academia | Scott Pesme¹, Aymeric Dieuleveut², Nicolas Flammarion¹. ¹Theory of Machine Learning lab, EPFL; ²Ecole Polytechnique. Correspondence to: Scott Pesme <scott.pesme@epfl.ch>. |
| Pseudocode | Yes | Algorithm 1 Convergence-Diagnostic algorithm |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of their code for the described methodology. |
| Open Datasets | Yes | ResNet18. We train an 18-layer ResNet model (He et al., 2016) on the CIFAR-10 dataset (Krizhevsky, 2009) using SGD with a momentum of 0.9, weight decay of 0.0001 and batch size of 128. ... We further investigate the performance of the distance-based diagnostic on real-world datasets: the Covertype dataset and the MNIST dataset¹. (Footnote 1: Covertype dataset available at archive.ics.uci.edu/ml/datasets/covertype and MNIST at yann.lecun.com/exdb/mnist.) |
| Dataset Splits | No | Each dataset is divided in two equal parts, one for training and one for testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using "Pytorch's ReduceLROnPlateau() scheduler" but does not specify version numbers for PyTorch or any other software components. |
| Experiment Setup | Yes | ResNet18. We train an 18-layer ResNet model (He et al., 2016) on the CIFAR-10 dataset (Krizhevsky, 2009) using SGD with a momentum of 0.9, weight decay of 0.0001 and batch size of 128. To adapt the distance-based step-size statistic to this scenario, we use Pytorch's ReduceLROnPlateau() scheduler... The parameters of the scheduler are set to: patience = 1000, threshold = 0.01... All initial step sizes are set to 0.1... The initial step size for our distance-based algorithm was set to 4/R². |
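
The hyperparameters quoted in the Experiment Setup row can be written down as a PyTorch configuration. The following is a minimal sketch, not the authors' code: the linear model and the random batch are hypothetical stand-ins for ResNet18 and CIFAR-10, and the quantity passed to the scheduler is a placeholder, since the excerpt elides how the distance-based statistic is fed to it. All scheduler arguments other than `patience` and `threshold` are left at PyTorch defaults, which is an assumption.

```python
import torch
from torch import nn

# Hypothetical stand-in model; the paper trains an 18-layer ResNet on CIFAR-10.
model = nn.Linear(3 * 32 * 32, 10)

# SGD settings quoted in the table: momentum 0.9, weight decay 0.0001,
# initial step size 0.1.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,
    momentum=0.9,
    weight_decay=1e-4,
)

# Scheduler settings quoted in the table: patience = 1000, threshold = 0.01.
# Every other argument keeps its PyTorch default (an assumption).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer,
    patience=1000,
    threshold=0.01,
)

# One illustrative training step on random data with the quoted batch size of 128.
inputs = torch.randn(128, 3 * 32 * 32)
targets = torch.randint(0, 10, (128,))
loss = nn.functional.cross_entropy(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Placeholder metric: the paper monitors a distance-based statistic whose
# exact form is not given in the excerpt, so the loss is used here instead.
scheduler.step(loss.item())
```

Because no PyTorch version is reported, this sketch relies only on the long-standing `torch.optim.SGD` and `ReduceLROnPlateau` interfaces.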