SGD without Replacement: Sharper Rates for General Smooth Convex Functions

Authors: Dheeraj Nagaraj, Prateek Jain, Praneeth Netrapalli

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Using the method of exchangeable pairs to bound the Wasserstein distance, we provide the first non-asymptotic results for SGDo when applied to general smooth, strongly convex functions. In particular, we show that SGDo converges at a rate of O(1/K^2) while SGD is known to converge at an O(1/K) rate, where K denotes the number of passes over the data and is required to be large enough. Existing results for SGDo in this setting require an additional Hessian Lipschitz assumption (Gürbüzbalaban et al., 2015; HaoChen & Sra, 2018).
Researcher Affiliation | Collaboration | Dheeraj Nagaraj (Massachusetts Institute of Technology, Cambridge, Massachusetts, USA); Praneeth Netrapalli and Prateek Jain (Microsoft Research, Bengaluru, Karnataka, India).
Pseudocode | Yes | Algorithm 1 "SGD: SGD with replacement" and Algorithm 2 "SGDo: SGD without replacement" (a sketch of both sampling schemes follows the table).
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing code for the work described in this paper, nor does it provide a direct link to a source-code repository.
Open Datasets | No | The paper is theoretical and does not involve experiments on datasets.
Dataset Splits | No | The paper is theoretical and does not involve experiments, so no dataset split information is provided.
Hardware Specification | No | The paper is theoretical and does not describe experiments that would require specific hardware. No hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe a software implementation or dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings.
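The paper's pseudocode is only named in the table above, not reproduced. Below is a minimal NumPy sketch of the two sampling schemes it contrasts, assuming a finite-sum objective f(x) = (1/n) * sum_i f_i(x) accessed through a per-component gradient oracle grad(i, x). The function names, the constant step size lr, and the fixed seed are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

def sgd_with_replacement(grad, x0, n, K, lr, seed=0):
    """Sketch of Algorithm 1 (SGD): each step samples a component
    index uniformly at random, *with* replacement."""
    x = np.asarray(x0, dtype=float).copy()
    rng = np.random.default_rng(seed)
    for _ in range(K * n):            # K "passes" of n steps each
        i = rng.integers(n)           # an index may repeat within a pass
        x -= lr * grad(i, x)
    return x

def sgd_without_replacement(grad, x0, n, K, lr, seed=0):
    """Sketch of Algorithm 2 (SGDo): each pass visits all n components
    in a fresh uniformly random order, i.e. *without* replacement."""
    x = np.asarray(x0, dtype=float).copy()
    rng = np.random.default_rng(seed)
    for _ in range(K):
        for i in rng.permutation(n):  # each index used exactly once per pass
            x -= lr * grad(i, x)
    return x
```

As a toy check (again an assumption, not from the paper), both routines can be run on a smooth, strongly convex least-squares objective:

```python
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 5))
b = A @ np.ones(5)
g = lambda i, x: 2.0 * A[i] * (A[i] @ x - b[i])   # gradient of (a_i^T x - b_i)^2
x_sgdo = sgd_without_replacement(g, np.zeros(5), n=100, K=50, lr=1e-2)
```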