Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers

Authors: Sejun Park, Umut Şimşekli, Murat A. Erdogdu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we propose a new covering technique localized for the trajectories of SGD. This localization provides an algorithm-specific complexity measured by the covering number, which can have dimension-independent cardinality, in contrast to standard uniform covering arguments that result in exponential dimension dependency. Based on this localized construction, we show that if the objective function is a finite perturbation of a piecewise strongly convex and smooth function with $P$ pieces, i.e., non-convex and non-smooth in general, the generalization error can be upper bounded by $O(\sqrt{\log n \log(nP)}/n)$, where $n$ is the number of data samples (see the LaTeX rendering after the table). In particular, this rate is independent of dimension and requires neither early stopping nor a decaying step size. Finally, we employ these results in various contexts and derive generalization bounds for multi-index linear models, multi-class support vector machines, and $K$-means clustering for both hard and soft label setups, improving the known state-of-the-art rates.
Researcher Affiliation | Academia | Sejun Park (Korea University, sejun.park000@gmail.com); Umut Şimşekli (DI ENS, École Normale Supérieure, Université PSL, CNRS, INRIA, umut.simsekli@inria.fr); Murat A. Erdogdu (University of Toronto & Vector Institute, erdogdu@cs.toronto.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper focuses on theoretical analysis and does not involve empirical studies with specific datasets. The ethics statement indicates N/A for experiments.
Dataset Splits | No | The paper does not provide specific dataset split information, as it is a theoretical work.
Hardware Specification | No | The paper is theoretical and does not involve running experiments on specific hardware. The ethics statement indicates N/A for experiments.
Software Dependencies | No | The paper is theoretical and does not mention any software dependencies with version numbers. The ethics statement indicates N/A for experiments.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings. The ethics statement indicates N/A for experiments.
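
For reference, the generalization bound quoted in the Research Type row above is rendered below as a LaTeX display. This is a sketch of the statement as given in the abstract, not a restatement of the formal theorem: $n$ denotes the number of data samples, $P$ the number of pieces of the piecewise strongly convex and smooth objective, and the constants hidden in the $O(\cdot)$ are not reproduced here.

\[
  \text{generalization error}
    \;\le\;
    O\!\left(\frac{\sqrt{\log n \,\log(nP)}}{n}\right),
\]

which, per the abstract, is independent of the ambient dimension and requires neither early stopping nor a decaying step size.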