Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization

Authors: Navid Azizan, Babak Hassibi

Venue: ICLR 2019

Reproducibility Variable: Research Type
Result: Theoretical
LLM Response: "In this paper, we present an alternative explanation of the behavior of SGD, and more generally, the stochastic mirror descent (SMD) family of algorithms, which includes SGD as a special case. We do so by obtaining a fundamental identity for such algorithms (see Lemmas 2 and 5). Using these identities, we show that for general nonlinear models and general loss functions, when the step size is sufficiently small, SMD (and therefore also SGD) is the optimal solution of a certain minimax filtering (or online learning) problem."

Reproducibility Variable: Researcher Affiliation
Result: Academia
LLM Response: Navid Azizan, California Institute of Technology, Pasadena, CA 91125, azizan@caltech.edu; Babak Hassibi, California Institute of Technology, Pasadena, CA 91125, hassibi@caltech.edu.

Reproducibility Variable: Pseudocode
Result: No
LLM Response: The paper describes its algorithms through mathematical update equations (e.g., Eqs. 3, 13, and 15) but does not contain structured pseudocode or an algorithm block. (An illustrative sketch of the SMD update appears after this table.)

Reproducibility Variable: Open Source Code
Result: No
LLM Response: The paper makes no statement about releasing source code for the described methodology and includes no links to code repositories.

Reproducibility Variable: Open Datasets
Result: No
LLM Response: The paper defines a training dataset only conceptually, with the notation {(x_i, y_i) : i = 1, ..., n}, and provides no concrete access information, such as a link, DOI, repository name, or formal citation, for any publicly available dataset.

Reproducibility Variable: Dataset Splits
Result: No
LLM Response: The paper provides no dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) that would be needed to reproduce the partitioning of data into training, validation, and test sets.

Reproducibility Variable: Hardware Specification
Result: No
LLM Response: The paper provides no hardware details (GPU/CPU models, processor types, memory amounts, or other machine specifications) for running its analysis or illustrative examples.

Reproducibility Variable: Software Dependencies
Result: No
LLM Response: The paper names no ancillary software, such as libraries or solvers with version numbers, that would be needed to replicate any computational aspect of the work.

Reproducibility Variable: Experiment Setup
Result: No
LLM Response: The paper discusses the step size η and mentions specific values in an illustrative example, but it does not provide comprehensive setup details, such as concrete hyperparameter values, training configurations, or system-level settings, for reproducible experiments.
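Since the paper presents its algorithms only as update equations (Eqs. 3, 13, and 15) rather than pseudocode, the following is a minimal NumPy sketch of the standard stochastic mirror descent update with a q-norm potential, under which q = 2 reduces the mirror map to the squared l2 norm and each step to plain SGD. All function names, data, and hyperparameter values below are illustrative assumptions, not taken from the paper.

import numpy as np

def grad_psi(w, q=2):
    # Gradient of the potential psi(w) = (1/q) * sum_j |w_j|^q.
    return np.sign(w) * np.abs(w) ** (q - 1)

def grad_psi_inv(z, q=2):
    # Inverse of grad_psi; for q = 2 this is the identity map.
    return np.sign(z) * np.abs(z) ** (1.0 / (q - 1))

def smd(X, y, eta=0.05, q=2, epochs=50, seed=0):
    # Stochastic mirror descent for a linear model y_i = <x_i, w>
    # with squared loss. Each step enforces
    #   grad_psi(w_new) = grad_psi(w) - eta * grad_loss_i(w),
    # the generic SMD recursion of which SGD is the q = 2 case.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (<x_i, w> - y_i)^2
            w = grad_psi_inv(grad_psi(w, q) - eta * g, q)
    return w

# Toy check on noiseless synthetic data; q = 2 makes each step plain SGD.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
w_hat = smd(X, X @ w_true)
print(np.linalg.norm(w_hat - w_true))  # should be close to 0

For q ≠ 2 the loop is unchanged but the potential, and hence the implicit regularization of the point SMD converges to, differs; note that the paper's minimax-optimality results are stated for the regime where the step size eta is sufficiently small.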