Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions
Authors: Anant Raj, Lingjiong Zhu, Mert Gurbuzbalaban, Umut Simsekli
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we build on this line of research and develop generalization bounds for a more general class of objective functions, which includes non-convex functions as well. Our approach is based on developing Wasserstein stability bounds for heavy-tailed SDEs and their discretizations, which we then convert to generalization bounds. |
| Researcher Affiliation | Academia | 1Coordinated Science Laboratory, University of Illinois Urbana-Champaign, IL, USA 2Inria, École Normale Supérieure, PSL Research University, Paris, France 3Department of Mathematics, Florida State University, FL, USA 4Department of Management Science and Information Systems, Rutgers University, NJ, USA 5Princeton University, NJ, USA. |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement or link to its open-source code. |
| Open Datasets | No | The paper is theoretical and discusses a 'training dataset Xn' in a conceptual context, but does not provide details or access information for a specific publicly available dataset used in experiments. |
| Dataset Splits | No | The paper is theoretical and does not specify training, test, or validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not provide details about an experimental setup, such as hyperparameters or training settings. |
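Since the paper is purely theoretical, it contains no reference implementation. For readers wanting a concrete picture of the object being analyzed, the following is a minimal sketch of SGD iterates perturbed by heavy-tailed (symmetric α-stable) noise, which is the kind of discretized heavy-tailed SDE the paper's stability bounds concern. All function names, the quadratic test loss, and the specific hyperparameters are illustrative assumptions, not the authors' code; the α-stable sampler uses the standard Chambers-Mallows-Stuck method.

```python
import numpy as np

def alpha_stable(alpha, size, rng):
    """Sample symmetric alpha-stable noise (beta = 0) via the
    Chambers-Mallows-Stuck method."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform angle
    w = rng.exponential(1.0, size)                 # unit exponential
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))

def heavy_tailed_sgd(grad, x0, eta=0.01, sigma=0.1, alpha=1.8,
                     steps=1000, seed=0):
    """Illustrative heavy-tailed SGD recursion:
    x_{k+1} = x_k - eta * grad(x_k) + eta**(1/alpha) * sigma * S_k,
    where S_k is symmetric alpha-stable noise (alpha in (1, 2];
    alpha = 2 recovers Gaussian-like light tails)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        noise = alpha_stable(alpha, x.shape, rng)
        x = x - eta * grad(x) + eta ** (1.0 / alpha) * sigma * noise
    return x

# Toy strongly convex loss f(x) = ||x||^2 / 2, so grad(x) = x.
x_final = heavy_tailed_sgd(lambda x: x, x0=np.ones(5))
```

With a smaller `alpha` the injected noise has heavier tails, producing the occasional large jumps whose effect on algorithmic stability the paper quantifies through Wasserstein distances between iterates trained on neighboring datasets.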