Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses
Authors: Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our work is the first to address uniform stability of SGD on nonsmooth convex losses. Specifically, we provide sharp upper and lower bounds for several forms of SGD and full-batch GD on arbitrary Lipschitz nonsmooth convex losses. Our lower bounds show that, in the nonsmooth case, (S)GD can be inherently less stable than in the smooth case. On the other hand, our upper bounds show that (S)GD is sufficiently stable for deriving new and useful bounds on generalization error. Most notably, we obtain the first dimension-independent generalization bounds for multi-pass SGD in the nonsmooth case. (The standard definition of uniform stability is sketched after this table.) |
| Researcher Affiliation | Collaboration | Raef Bassily (Department of Computer Science & Engineering, The Ohio State University); Vitaly Feldman (Apple); Cristóbal Guzmán (Institute for Mathematical and Computational Engineering, Pontificia Universidad Católica de Chile; ANID Millennium Science Initiative Program, Millennium Nucleus Center for the Discovery of Structures in Complex Data); Kunal Talwar (Apple) |
| Pseudocode | Yes | Algorithm 1 (A_rSGD): Sampling-with-replacement SGD. (A hedged Python sketch of this style of algorithm follows the table.) |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and uses the concept of a 'dataset' in a general sense (S = (z1, ..., zn)) for theoretical analysis, but does not mention or provide access information for any specific publicly available datasets for training purposes. |
| Dataset Splits | No | The paper is theoretical and does not describe practical experiment setups, including data splits for training, validation, or testing. |
| Hardware Specification | No | This is a theoretical paper providing mathematical analyses and does not describe computational experiments or specify any hardware used. |
| Software Dependencies | No | This is a theoretical paper and does not mention any specific software dependencies or versions required for implementation or reproduction. |
| Experiment Setup | No | This is a theoretical paper presenting algorithms and proofs, and therefore does not include details on experimental setup such as hyperparameters or training configurations. |
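
For context on the "Research Type" entry above: the paper's results concern uniform stability. Below is a minimal sketch of the standard definition in generic textbook notation, which is an assumption on our part and not copied verbatim from the paper.

```latex
% Uniform stability of a (possibly randomized) algorithm A.
% S ~ S' denotes neighboring datasets differing in a single example;
% f(w; z) is the loss of model w on example z.
% A is \gamma-uniformly stable if
\[
  \sup_{S \simeq S'} \, \sup_{z} \,
  \mathbb{E}_{A}\bigl[ f(A(S); z) - f(A(S'); z) \bigr] \;\le\; \gamma .
\]
% The standard consequence used for generalization bounds:
% \gamma-uniform stability implies
% \mathbb{E}[F(A(S)) - F_S(A(S))] \le \gamma,
% where F is the population risk and F_S the empirical risk on S.
```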
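The "Pseudocode" entry above refers to the paper's Algorithm 1, sampling-with-replacement SGD. The following is a minimal Python sketch of that style of algorithm, assuming a projected subgradient step on an L2 ball and an averaged iterate as output; the function names, projection, and averaging are illustrative assumptions, not the paper's exact pseudocode.

```python
import numpy as np

def sgd_with_replacement(data, subgrad, eta, T, radius, d, rng=None):
    """Sketch of sampling-with-replacement SGD for a nonsmooth convex,
    Lipschitz loss. `subgrad(w, z)` should return any subgradient of
    f(w; z) at w; `radius` bounds the feasible L2 ball. These choices
    are assumptions for illustration, not the paper's specification."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.zeros(d)
    iterates = []
    for _ in range(T):
        z = data[rng.integers(len(data))]  # draw one example with replacement
        w = w - eta * subgrad(w, z)        # subgradient step (loss may be nonsmooth)
        norm = np.linalg.norm(w)
        if norm > radius:                  # project back onto the L2 ball
            w *= radius / norm
        iterates.append(w.copy())
    return np.mean(iterates, axis=0)       # return the average iterate

# Toy usage with the nonsmooth absolute loss f(w; (x, y)) = |<w, x> - y|,
# whose subgradient is sign(<w, x> - y) * x (hypothetical data):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(100, 5)), rng.normal(size=100)
    data = list(zip(X, y))
    sg = lambda w, z: np.sign(w @ z[0] - z[1]) * z[0]
    w_bar = sgd_with_replacement(data, sg, eta=0.05, T=1000, radius=10.0, d=5)
```

The average iterate is the standard output for subgradient methods, since individual iterates need not converge on nonsmooth objectives.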