The Sample Complexity of Gradient Descent in Stochastic Convex Optimization

Authors: Roi Livni

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "We analyze the sample complexity of full-batch Gradient Descent (GD) in the setup of non-smooth Stochastic Convex Optimization. We show that the generalization error of GD... Our bound follows from a new generalization bound... This resolves an open problem... The proof is an immediate corollary of Lemmas 8 and 9, which we prove in Appendices A.1 and A.2 respectively."
Researcher Affiliation | Academia | Roi Livni, School of Electrical Engineering, Tel Aviv University, rlivni@tauex.tau.ac.il
Pseudocode | No | GD depends on hyperparameters T ∈ β„• and Ξ· > 0 and operates on the empirical risk as follows. The algorithm receives as input a sample S = {z_1, ..., z_m}, sets w_0 = 0, and runs for T iterations according to the recursion w_{t+1} = Ξ ( w_t βˆ’ (Ξ·/m) βˆ‘_{i=1}^m βˆ‡f(w_t, z_i) ) (Eq. 4), where Ξ  denotes projection onto the constraint set and βˆ‡f(w_t, z_i) a subgradient of the loss at w_t on example z_i. (A runnable sketch of this update is given after the table.)
Open Source Code | No | No mention of open-source code for the methodology or any code release.
Open Datasets | No | No specific datasets, public or otherwise, are mentioned for empirical evaluation; the paper works with abstract samples drawn from a distribution.
Dataset Splits | No | No empirical validation process or dataset splits are described.
Hardware Specification | No | No hardware is specified, as no experiments were conducted.
Software Dependencies | No | No software dependencies are mentioned, as no experiments were conducted.
Experiment Setup | No | No experimental setup, hyperparameters, or training configurations are described, as no experiments were conducted.
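
The GD recursion quoted in the Pseudocode row can be instantiated with a short, self-contained Python sketch. This is not code from the paper (none is released); the absolute loss f(w, (x, y)) = |<w, x> βˆ’ y|, the unit-ball projection, the step size, the iteration count, and the averaged output iterate are illustrative assumptions chosen only to show the full-batch update of Eq. (4).

# Minimal sketch of full-batch projected (sub)gradient descent on the empirical
# risk, following the recursion quoted in the Pseudocode row (Eq. 4):
#     w_{t+1} = Pi( w_t - (eta/m) * sum_i subgrad f(w_t, z_i) )
# The loss, projection radius, step size, and output rule below are illustrative
# choices, not taken from the paper.
import numpy as np


def project_ball(w, radius=1.0):
    """Euclidean projection onto the ball of the given radius (Pi in Eq. 4)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else (radius / norm) * w


def subgrad_abs_loss(w, x, y):
    """A subgradient of the non-smooth convex loss f(w, (x, y)) = |<w, x> - y|."""
    return np.sign(np.dot(w, x) - y) * x


def gd_on_empirical_risk(sample, eta, T, dim):
    """Run T full-batch GD steps from w_0 = 0 on the sample S = {z_1, ..., z_m}."""
    w = np.zeros(dim)
    m = len(sample)
    iterates = [w]
    for _ in range(T):
        g = sum(subgrad_abs_loss(w, x, y) for (x, y) in sample) / m  # average subgradient
        w = project_ball(w - eta * g)                                # projected step
        iterates.append(w)
    # Averaging the iterates is one common output rule for non-smooth GD;
    # it is used here purely as an illustrative choice.
    return np.mean(iterates, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, dim = 50, 5
    X = rng.normal(size=(m, dim))
    y = X @ (np.ones(dim) / np.sqrt(dim)) + 0.1 * rng.normal(size=m)
    S = list(zip(X, y))
    w_hat = gd_on_empirical_risk(S, eta=0.1, T=200, dim=dim)
    print("empirical risk:", np.mean([abs(np.dot(w_hat, x) - yi) for x, yi in S]))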