Benign Underfitting of Stochastic Gradient Descent

Authors: Tomer Koren, Roi Livni, Yishay Mansour, Uri Sherman

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study to what extent may stochastic gradient descent (SGD) be understood as a conventional learning rule that achieves generalization performance by obtaining a good fit to training data. We consider the fundamental stochastic convex optimization framework, where (one pass, without-replacement) SGD is classically known to minimize the population risk at rate O(1/√n), and prove that, surprisingly, there exist problem instances where the SGD solution exhibits both empirical risk and generalization gap of Ω(1). (A worked form of this risk decomposition appears after the table.)
Researcher Affiliation | Collaboration | Tomer Koren, Blavatnik School of Computer Science, Tel Aviv University and Google Research (tkoren@tauex.tau.ac.il); Roi Livni, Department of Electrical Engineering, Tel Aviv University (rlivni@tauex.tau.ac.il); Yishay Mansour, Blavatnik School of Computer Science, Tel Aviv University and Google Research (mansour.yishay@gmail.com); Uri Sherman, Blavatnik School of Computer Science, Tel Aviv University (urisherman@mail.tau.ac.il)
Pseudocode | No | The paper describes the SGD update rule in text rather than providing structured pseudocode or an algorithm block: 'Throughout most of the paper, we consider one-pass projected SGD over S; initialize at w_1 ∈ W; for t = 1, 2, . . . , n: w_{t+1} ← Π_W(w_t − η g_t), with g_t ∈ ∂f(w_t; z_t).' (A hedged code sketch of this update follows the table.)
Open Source Code | No | The paper does not provide access to source code. In the checklist, it states: '3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'
Open Datasets | No | The paper is theoretical and constructs problem instances for analysis; it does not use, or provide access information for, any empirical training dataset. The checklist for experimental details confirms this with '[N/A]'.
Dataset Splits | No | The paper is theoretical and does not involve empirical validation splits of a dataset. The checklist for experimental details confirms this with '[N/A]'.
Hardware Specification | No | The paper is theoretical and does not describe hardware used for experiments. The checklist explicitly states: '3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]'
Software Dependencies | No | The paper is theoretical and does not describe software dependencies with specific version numbers. The checklist for experimental details confirms this with '[N/A]'.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with details such as hyperparameters or training configurations. The checklist for experimental details confirms this with '[N/A]'.
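To make the Research Type row's headline claim concrete, here is the standard risk decomposition the abstract alludes to, written out in LaTeX. The notation (F for population risk, F̂ for empirical risk, w_S for the SGD output) is ours for illustration and not verbatim from the paper; the Ω(1) gap is stated in absolute value since underfitting means the empirical risk exceeds the population risk.

% Notation (assumed for illustration, not quoted from the paper):
% F = population risk, \widehat{F} = empirical risk over the n-sample S,
% w_S = the output of one-pass projected SGD run on S.
\[
F(w) = \mathbb{E}_{z \sim \mathcal{D}}\left[ f(w; z) \right],
\qquad
\widehat{F}(w) = \frac{1}{n} \sum_{i=1}^{n} f(w; z_i).
\]
% The classical upper bound beside the paper's lower bounds:
\[
F(w_S) - \min_{w \in W} F(w) = O\!\left(\frac{1}{\sqrt{n}}\right),
\qquad
\widehat{F}(w_S) = \Omega(1),
\qquad
\left| F(w_S) - \widehat{F}(w_S) \right| = \Omega(1).
\]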
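Because the paper gives the update rule only in prose (see the Pseudocode row), the following is a minimal Python sketch of one-pass projected SGD with iterate averaging, under assumed choices the quoted text does not pin down: Euclidean projection onto an L2 ball, zero initialization, and the uniform average of the iterates as output. grad_f and radius are illustrative placeholders, not names from the paper.

import numpy as np

def project_l2_ball(w, radius=1.0):
    # Euclidean projection onto W = {w : ||w||_2 <= radius}.
    norm = np.linalg.norm(w)
    return w if norm <= radius else (radius / norm) * w

def one_pass_sgd(sample, grad_f, eta, dim, radius=1.0):
    # One-pass projected SGD: w_{t+1} = Pi_W(w_t - eta * g_t),
    # with g_t a (sub)gradient of f(w_t; z_t); each z_t in the
    # n-sample S is visited exactly once, without replacement.
    w = np.zeros(dim)                      # w_1 in W (assumed initialization)
    iterates = [w]
    for z in sample:                       # t = 1, 2, ..., n
        g = grad_f(w, z)                   # g_t in the subgradient of f(w_t; z_t)
        w = project_l2_ball(w - eta * g, radius)
        iterates.append(w)
    return np.mean(iterates[:-1], axis=0)  # average of w_1, ..., w_n

For instance, with a convex loss such as f(w; z) = max(0, 1 - y * w @ x) for z = (x, y), one passes its subgradient as grad_f; a step size eta on the order of 1/√n recovers the classical O(1/√n) population-risk rate quoted above.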