Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
Authors: Jaeyoung Cha, Jaewook Lee, Chulhee Yun
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components n and the number of epochs K, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number κ. For SGD with Random Reshuffling, we present lower bounds that have tighter κ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. |
| Researcher Affiliation | Academia | Kim Jaechul Graduate School of AI, KAIST, Seoul, South Korea. Correspondence to: Chulhee Yun <chulhee.yun@kaist.ac.kr>. |
| Pseudocode | Yes | Algorithm 1 Offline GraB (Lu et al., 2022a) |
| Open Source Code | No | The paper does not provide any statement about releasing its own source code or a link to a repository for the theoretical methodology developed in this paper. |
| Open Datasets | No | The paper is theoretical and does not involve empirical experiments using datasets. Therefore, no information on public availability of a dataset is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments using datasets, thus no dataset split information for validation is provided. |
| Hardware Specification | No | The paper is purely theoretical and focuses on mathematical proofs. It does not describe any experimental setup or the hardware used for computations. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs. It does not mention specific software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical proofs. It does not provide details about an experimental setup or hyperparameters. |
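
For readers unfamiliar with the setting named in the abstract, the following is a minimal sketch of without-replacement SGD with Random Reshuffling that also returns a weighted average of iterates. It is not taken from the paper, which provides no code. The function name `shuffling_sgd`, its parameters, the one-dimensional quadratic components, and the choice to average end-of-epoch iterates are all illustrative assumptions.

```python
import random


def shuffling_sgd(grads, x0, lr, epochs, seed=0, weights=None):
    """Without-replacement SGD with Random Reshuffling on a 1-D finite sum.

    grads   : list of n component gradient functions (hypothetical interface)
    weights : optional per-epoch weights for the averaged iterate;
              uniform if omitted (an assumption for this sketch)
    Returns (final iterate, weighted average of end-of-epoch iterates).
    """
    rng = random.Random(seed)
    n = len(grads)
    if weights is None:
        weights = [1.0] * epochs
    x = x0
    weighted_sum = 0.0
    for k in range(epochs):
        perm = list(range(n))
        rng.shuffle(perm)            # draw a fresh permutation every epoch
        for i in perm:               # each component is visited exactly once
            x -= lr * grads[i](x)
        weighted_sum += weights[k] * x
    return x, weighted_sum / sum(weights)


# Illustrative components f_i(x) = 0.5 * (x - a_i)^2, so f_i'(x) = x - a_i.
# The minimizer of the average objective is the mean of the a_i, here 0.5.
targets = [-1.0, 0.0, 1.0, 2.0]
component_grads = [lambda x, a=a: x - a for a in targets]

x_final, x_avg = shuffling_sgd(component_grads, x0=0.0, lr=0.05, epochs=200)
```

With a constant step size, the end-of-epoch iterate settles near the minimizer but keeps a permutation-dependent offset. The size of that kind of error as a function of n, K, and κ is what the paper's lower bounds characterize.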