SGD with shuffling: optimal rates without component convexity and large epoch requirements

Authors: Kwangjun Ahn, Chulhee Yun, Suvrit Sra

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study without-replacement SGD for solving finite-sum optimization problems. Specifically, depending on how the indices of the finite-sum are shuffled, we consider the RANDOMSHUFFLE (shuffle at the beginning of each epoch) and SINGLESHUFFLE (shuffle only once) algorithms. First, we establish minimax optimal convergence rates of these algorithms up to poly-log factors. Notably, our analysis is general enough to cover gradient dominated nonconvex costs, and does not rely on the convexity of individual component functions unlike existing optimal convergence results.
Researcher Affiliation | Academia | Kwangjun Ahn* (Department of EECS, MIT, kjahn@mit.edu); Chulhee Yun* (Department of EECS, MIT, chulheey@mit.edu); Suvrit Sra (Department of EECS, MIT, suvrit@mit.edu)
Pseudocode | No | The paper describes the algorithms in prose but does not provide structured pseudocode or algorithm blocks (an illustrative sketch is given after this table).
Open Source Code | No | The paper does not provide any statement or link indicating that open-source code for the described methodology is available.
Open Datasets | No | This is a theoretical paper focused on mathematical analysis and proofs of convergence rates. It does not conduct empirical studies, use datasets, or perform training.
Dataset Splits | No | This is a theoretical paper focused on mathematical analysis and proofs of convergence rates. It does not involve dataset splits (training, validation, test) for empirical evaluation.
Hardware Specification | No | This is a theoretical paper that does not report on empirical experiments. Therefore, no hardware specifications are mentioned.
Software Dependencies | No | This is a theoretical paper that does not report on empirical experiments. Therefore, no specific software dependencies with version numbers are mentioned.
Experiment Setup | No | This is a theoretical paper focused on mathematical analysis. It does not include an experimental setup with hyperparameters or system-level training settings.
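
Since the paper itself provides no pseudocode, the following is a minimal Python/NumPy sketch, written for this page rather than taken from the paper, of the two shuffling schemes the abstract describes: RANDOMSHUFFLE draws a fresh permutation of the component indices at the start of every epoch, while SINGLESHUFFLE draws one permutation and reuses it across all epochs. The function name `shuffled_sgd`, the `scheme` argument, and the toy quadratic objective are illustrative assumptions, not the authors' code.

```python
import numpy as np

def shuffled_sgd(grads, x0, lr, n_epochs, scheme="random_shuffle", seed=0):
    """Without-replacement SGD over n component gradient functions.

    scheme="random_shuffle": reshuffle indices at the start of each epoch.
    scheme="single_shuffle": shuffle once, reuse the same order every epoch.
    (Sketch of the schemes described in the abstract; not the authors' code.)
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    order = rng.permutation(n)  # fixed order used by SINGLESHUFFLE
    x = np.asarray(x0, dtype=float)
    for _ in range(n_epochs):
        if scheme == "random_shuffle":
            order = rng.permutation(n)  # fresh permutation each epoch
        for i in order:
            x = x - lr * grads[i](x)  # one component-gradient step
    return x

# Toy usage: f(x) = (1/n) * sum_i 0.5 * (x - a_i)^2, minimized at mean(a).
a = np.array([1.0, 2.0, 3.0, 4.0])
grads = [lambda x, ai=ai: x - ai for ai in a]
x_hat = shuffled_sgd(grads, x0=0.0, lr=0.1, n_epochs=50)
print(x_hat)  # approaches mean(a) = 2.5
```

On this toy quadratic, both schemes converge to the mean of the a_i; the paper's contribution is the minimax-optimal rate analysis of such shuffling schemes (without assuming convexity of the individual components), not the algorithms themselves.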