On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
Authors: Panayotis Mertikopoulos, Nadav Hallak, Ali Kavis, Volkan Cevher
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore these properties in a range of standard non-convex test functions and by training a ResNet architecture for a classification task over CIFAR. ... As an illustration of our theoretical analysis, we plot in Fig. 1a the convergence rate of (SGD) in the standard Shekel risk benchmark function... We demonstrate the benefits of this cooldown heuristic in a standard ResNet18 architecture for a classification task over CIFAR10. |
| Researcher Affiliation | Collaboration | Panayotis Mertikopoulos (Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG & Criteo AI Lab) panayotis.mertikopoulos@imag.fr; Nadav Hallak (Technion) ndvhllk@technion.ac.il; Ali Kavis (École Polytechnique Fédérale de Lausanne, EPFL) ali.kavis@epfl.ch; Volkan Cevher (École Polytechnique Fédérale de Lausanne, EPFL) volkan.cevher@epfl.ch |
| Pseudocode | No | The paper presents the SGD algorithm as a mathematical formula, X_{n+1} = X_n − γ_n V_n (SGD), but does not include structured pseudocode or an algorithm block. (A minimal sketch of this update appears after the table.) |
| Open Source Code | No | The paper does not contain any statements about providing open-source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | Yes | We explore these properties in a range of standard non-convex test functions and by training a ResNet architecture for a classification task over CIFAR. We demonstrate the benefits of this cooldown heuristic in a standard ResNet18 architecture for a classification task over CIFAR10. |
| Dataset Splits | No | The paper mentions using CIFAR10 and training a ResNet architecture, but it does not provide specific details about the training, validation, or test dataset splits (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experiment environment. |
| Experiment Setup | Yes | For our experiments, we ran N = 10³ instances of (SGD) with a constant, 1/√n, and 1/n step-size schedule... we ran (SGD) with a constant step-size for 100 epochs, with checkpoints at different cutoffs; then, at each checkpoint, we launched the cooldown period with step-size 1/n. (Both the step-size schedules and the cooldown heuristic are sketched in code after the table.) |
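
To make the quoted update rule and step-size schedules concrete, here is a minimal Python sketch of (SGD) under the three schedules from the Experiment Setup row. The Rastrigin objective, the Gaussian gradient-noise model, and the step-size constants are illustrative assumptions, not the paper's exact benchmark setup (the paper uses, e.g., the Shekel function).

```python
# A minimal sketch of the (SGD) update X_{n+1} = X_n - gamma_n * V_n under the
# three step-size schedules quoted above. The Rastrigin objective, the Gaussian
# gradient-noise model, and the step-size constants are illustrative
# assumptions, not the paper's exact benchmark setup.
import numpy as np

def rastrigin_grad(x):
    # Gradient of the non-convex Rastrigin test function
    # f(x) = 10d + sum(x_i^2 - 10 cos(2 pi x_i)).
    return 2.0 * x + 20.0 * np.pi * np.sin(2.0 * np.pi * x)

def run_sgd(step_size, x0, n_steps=1000, noise_std=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for n in range(1, n_steps + 1):
        # Noisy first-order oracle V_n = grad f(X_n) + noise.
        v = rastrigin_grad(x) + noise_std * rng.standard_normal(x.shape)
        x = x - step_size(n) * v  # X_{n+1} = X_n - gamma_n * V_n
    return x

schedules = {
    "constant":  lambda n: 1e-2,
    "1/sqrt(n)": lambda n: 1e-1 / np.sqrt(n),
    "1/n":       lambda n: 1e-1 / n,
}
for name, gamma in schedules.items():
    print(f"{name:>9s} -> final iterate {run_sgd(gamma, x0=[2.5, -1.5])}")
```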
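
Similarly, a hedged sketch of the cooldown heuristic, assuming a PyTorch training loop: the helper names (`train_epoch`, `constant_phase`, `cooldown`), the cutoff epochs, the base learning rate, and the per-epoch (rather than per-iteration) decay are all placeholder assumptions; only the overall recipe (constant step-size for 100 epochs with checkpoints, then a 1/n cooldown launched from each checkpoint) comes from the paper.

```python
# A hedged sketch of the cooldown heuristic: train with a constant step-size,
# checkpoint at chosen cutoffs, then resume each checkpoint with a 1/n decay.
# Model, data loader, cutoffs, and base_lr are placeholders, not the paper's
# exact hyperparameters.
import copy
import torch

def train_epoch(model, loader, optimizer, loss_fn):
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss_fn(model(inputs), targets).backward()
        optimizer.step()

def constant_phase(model, loader, loss_fn, base_lr=0.1,
                   epochs=100, cutoffs=(25, 50, 75, 100)):
    # Constant step-size phase, saving a checkpoint at each cutoff epoch.
    optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)
    checkpoints = {}
    for epoch in range(1, epochs + 1):
        train_epoch(model, loader, optimizer, loss_fn)
        if epoch in cutoffs:
            checkpoints[epoch] = copy.deepcopy(model.state_dict())
    return checkpoints

def cooldown(model, checkpoint, loader, loss_fn, base_lr=0.1, epochs=10):
    # Resume from a checkpoint and decay the step-size as gamma_n = base_lr / n.
    model.load_state_dict(checkpoint)
    optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)
    for n in range(1, epochs + 1):
        optimizer.param_groups[0]["lr"] = base_lr / n
        train_epoch(model, loader, optimizer, loss_fn)
```

With a torchvision ResNet18 and a CIFAR10 DataLoader substituted for the placeholders, running `constant_phase` and then `cooldown` from each saved checkpoint would mirror the shape of the experiment described in the table, though the exact hyperparameters above are not reported in the paper.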