Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

SGD with shuffling: optimal rates without component convexity and large epoch requirements

Authors: Kwangjun Ahn, Chulhee Yun, Suvrit Sra

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study without-replacement SGD for solving finite-sum optimization problems. Specifically, depending on how the indices of the finite sum are shuffled, we consider the RANDOMSHUFFLE (shuffle at the beginning of each epoch) and SINGLESHUFFLE (shuffle only once) algorithms. First, we establish minimax optimal convergence rates of these algorithms up to poly-log factors. Notably, our analysis is general enough to cover gradient-dominated nonconvex costs, and, unlike existing optimal convergence results, does not rely on the convexity of the individual component functions.
Researcher Affiliation | Academia | Kwangjun Ahn*, Chulhee Yun*, and Suvrit Sra are all affiliated with the Department of EECS, MIT.
Pseudocode | No | The paper describes its algorithms in prose but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides no statement or link indicating that open-source code for the described methodology is available.
Open Datasets | No | This is a theoretical paper focused on mathematical analysis and proofs of convergence rates; it does not conduct empirical studies, use datasets, or perform training.
Dataset Splits | No | As a theoretical paper, it does not involve dataset splits (training, validation, test) for empirical evaluation.
Hardware Specification | No | The paper reports no empirical experiments, so no hardware specifications are mentioned.
Software Dependencies | No | The paper reports no empirical experiments, so no specific software dependencies with version numbers are mentioned.
Experiment Setup | No | The paper does not include an experimental setup with hyperparameters or system-level training settings.
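For readers unfamiliar with the two shuffling schemes named in the abstract, the following is a minimal illustrative sketch (our own, not code from the paper): RANDOMSHUFFLE draws a fresh permutation of the component indices at the start of every epoch, while SINGLESHUFFLE permutes the indices once and reuses that order in every epoch. The toy quadratic finite sum and all function names here are assumptions for illustration only.

```python
import random

def shuffled_sgd(component_grads, x0, lr, epochs, scheme="random"):
    """Without-replacement SGD on f(x) = (1/n) * sum_i f_i(x).

    scheme="random": RandomShuffle, reshuffle the index order every epoch.
    scheme="single": SingleShuffle, shuffle once and reuse the same order.
    """
    order = list(range(len(component_grads)))
    random.shuffle(order)              # both schemes shuffle at least once
    x = x0
    for _ in range(epochs):
        if scheme == "random":
            random.shuffle(order)      # fresh permutation each epoch
        for i in order:                # one full pass over all indices = one epoch
            x = x - lr * component_grads[i](x)
    return x

# Toy finite sum: f_i(x) = 0.5 * (x - a_i)^2, so the minimizer is mean(a) = 2.5.
a = [1.0, 2.0, 3.0, 4.0]
grads = [lambda x, ai=ai: x - ai for ai in a]
x_rs = shuffled_sgd(grads, x0=0.0, lr=0.1, epochs=200, scheme="random")
x_ss = shuffled_sgd(grads, x0=0.0, lr=0.1, epochs=200, scheme="single")
# Both end-of-epoch iterates land near the minimizer 2.5,
# up to a stepsize-dependent error.
```

The only difference between the two schemes is the single `if scheme == "random"` branch; the paper's contribution is showing both achieve minimax optimal rates without assuming each f_i is convex.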