Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

Authors: Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "As compared with other bounds that depend on the squared norms of gradients, empirical investigations show that the terms in our bounds are orders of magnitude smaller. In this section, we perform an empirical comparison of the gradient prediction residual of our data dependent priors and the gradient norm across various architectures and datasets." (A hypothetical sketch of such a comparison is given below the table.) |
| Researcher Affiliation | Collaboration | Jeffrey Negrea (University of Toronto, Vector Institute); Mahdi Haghifam (University of Toronto, Element AI); Gintare Karolina Dziugaite (Element AI); Ashish Khisti (University of Toronto); Daniel M. Roy (University of Toronto, Vector Institute) |
| Pseudocode | No | The paper provides an equation for the SGLD iterates (Eq. 3) but does not include a structured pseudocode block or algorithm listing. (An illustrative sketch of the standard SGLD update is given below the table.) |
| Open Source Code | No | The paper contains no statement or link indicating that the authors' source code for their methodology is publicly available. |
| Open Datasets | Yes | [19] Y. Le Cun, C. Cortes, and C. J. C. Burges. MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/, 2010. The paper also uses the standard public Fashion-MNIST and CIFAR-10 datasets in its empirical results. |
| Dataset Splits | No | The paper mentions "heldout data" in Figure 1a, but it provides no explicit training, validation, or test split percentages or sample counts, and it does not specify a cross-validation setup. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., CPU or GPU models, or cloud computing instance types). |
| Software Dependencies | No | The paper mentions using the Adam optimizer, but it provides no version numbers for any software libraries, frameworks, or environments used in the experiments. |
| Experiment Setup | Yes | "The details of our model architectures, temperature, learning rate schedules and hyperparameter selections may be found in Appendix H." Appendix H contains concrete details on model architectures and hyperparameters such as learning rates, batch sizes, and temperature schedules. |
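
As noted in the Pseudocode row, the paper presents SGLD only as an update equation (Eq. 3), of the standard form w_{t+1} = w_t − η ∇L_B(w_t) + sqrt(2η/β) ξ with ξ ~ N(0, I). The following is a minimal Python sketch of that generic update, not the authors' implementation; `sgld_step`, `grad_fn`, `inv_temp`, and the toy quadratic example are our own illustrative names and choices.

```python
import numpy as np

def sgld_step(w, grad_fn, data, batch_size, step_size, inv_temp, rng):
    """One SGLD iterate: a minibatch gradient step plus Gaussian noise.

    Implements w <- w - eta * g + sqrt(2 * eta / beta) * xi, where g is a
    minibatch gradient estimate and xi ~ N(0, I). Illustrative sketch of the
    standard update, not the paper's code.
    """
    idx = rng.choice(len(data), size=batch_size, replace=False)  # sample a minibatch
    g = grad_fn(w, data[idx])                                    # minibatch gradient estimate
    noise = rng.normal(size=w.shape)                             # isotropic Gaussian noise
    return w - step_size * g + np.sqrt(2.0 * step_size / inv_temp) * noise

# Toy usage: loss 0.5 * ||w - x||^2 averaged over a batch, whose gradient
# is w - mean(batch). Hyperparameter values here are arbitrary.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))
grad_fn = lambda w, batch: w - batch.mean(axis=0)
w = np.zeros(2)
for _ in range(1000):
    w = sgld_step(w, grad_fn, data, batch_size=10,
                  step_size=0.01, inv_temp=100.0, rng=rng)
```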
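The Research Type row quotes the paper's empirical comparison between the gradient prediction residual of its data-dependent priors and the gradient norm. The sketch below shows one way such a comparison could be computed at a single iterate, under the assumption, ours rather than anything stated in this report, that the prior's prediction is a gradient estimated from a subset of the training sample; `residual_vs_norm`, `train_data`, and `prior_subset` are hypothetical names.

```python
import numpy as np

def residual_vs_norm(w, grad_fn, train_data, prior_subset):
    """Squared gradient prediction residual vs. squared gradient norm at w.

    Assumption: the data-dependent prior predicts the training gradient by
    the gradient computed on a subset of the sample, and the residual is the
    difference between the two estimates. A sketch, not the paper's exact
    estimator.
    """
    g_train = grad_fn(w, train_data)     # gradient on the training sample
    g_prior = grad_fn(w, prior_subset)   # prior's gradient prediction
    residual_sq = float(np.sum((g_train - g_prior) ** 2))
    norm_sq = float(np.sum(g_train ** 2))
    return residual_sq, norm_sq
```

If the residual term comes out orders of magnitude smaller than the squared gradient norm, as the quoted passage reports, bounds driven by the residual are correspondingly tighter than bounds driven by the gradient norm.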