Time-independent Generalization Bounds for SGLD in Non-convex Settings

Authors: Tyler Farghly, Patrick Rebeschini

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/optimization literature. Unlike existing bounds for SGLD in non-convex settings, ours are time-independent and decay to zero as the sample size increases. Using the framework of uniform stability, we establish time-independent bounds by exploiting the Wasserstein contraction property of the Langevin diffusion, which also allows us to circumvent the need to bound gradients using Lipschitz-like assumptions. Our analysis also supports variants of SGLD that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
Researcher Affiliation | Academia | Tyler Farghly, Department of Statistics, University of Oxford, farghly@stats.ox.ac.uk; Patrick Rebeschini, Department of Statistics, University of Oxford, patrick.rebeschini@stats.ox.ac.uk
Pseudocode | No | The paper defines algorithms using mathematical equations (e.g., equation 1) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not conduct experiments on a specific dataset, so no information about public dataset availability is provided.
Dataset Splits | No | The paper is theoretical and does not conduct experiments involving dataset splits for validation.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware. No hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe any experiments or implementations that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper focuses on mathematical bounds and analyses, not on empirical experimental setups with hyperparameters or training configurations.
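The paper defines constant-learning-rate SGLD only through mathematical equations, so a minimal sketch of the standard SGLD recursion may help fix ideas. This is an illustrative reconstruction under assumed notation (step size eta, inverse temperature beta, gradient of the empirical loss), not the paper's exact formulation:

```python
import numpy as np

def sgld_step(w, grad, eta, beta, rng):
    """One SGLD update with constant step size.

    Standard form: w <- w - eta * grad + sqrt(2 * eta / beta) * xi,
    with xi ~ N(0, I). Sketch only; the paper's equation 1 may differ
    in parametrization (e.g., noise scaling or projection steps).
    """
    noise = rng.standard_normal(w.shape)
    return w - eta * grad + np.sqrt(2.0 * eta / beta) * noise

# Usage: run SGLD on the quadratic loss f(w) = 0.5 * ||w||^2,
# a simple example of a dissipative, smooth objective.
rng = np.random.default_rng(0)
w = np.ones(5)
for _ in range(1000):
    w = sgld_step(w, grad=w, eta=0.01, beta=10.0, rng=rng)
```

For this objective the gradient is just w itself, and the iterates fluctuate around the origin at a scale governed by beta, illustrating how the constant step size keeps the chain near its stationary distribution rather than converging to a point.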