Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Authors: Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical investigations show that the terms in our bounds are orders of magnitude smaller than those in other bounds that depend on the squared norms of gradients. In this section, we perform an empirical comparison of the gradient prediction residual of our data-dependent priors and the gradient norm across various architectures and datasets. |
| Researcher Affiliation | Collaboration | Jeffrey Negrea (University of Toronto, Vector Institute); Mahdi Haghifam (University of Toronto, Element AI); Gintare Karolina Dziugaite (Element AI); Ashish Khisti (University of Toronto); Daniel M. Roy (University of Toronto, Vector Institute) |
| Pseudocode | No | The paper provides an equation for the SGLD iterates (Eq. 3) but does not include a structured pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the authors' source code for their methodology is publicly available. |
| Open Datasets | Yes | [19] Y. Le Cun, C. Cortes, and C. J. C. Burges. MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/. 2010. (The paper also uses Fashion-MNIST and CIFAR-10, standard public datasets, in its empirical results section.) |
| Dataset Splits | No | The paper mentions 'heldout data' in Figure 1a, but it does not provide explicit training, validation, or test split percentages or sample counts, nor does it specify a cross-validation setup. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU, GPU models, or cloud computing instance types). |
| Software Dependencies | No | The paper mentions using the Adam optimizer, but it does not provide specific version numbers for any software libraries, frameworks, or environments used in the experiments. |
| Experiment Setup | Yes | The details of our model architectures, temperature, learning rate schedules and hyperparameter selections may be found in Appendix H. (Appendix H contains concrete details on model architectures, hyperparameters like learning rates, batch sizes, and temperature schedules.) |
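The table notes that the paper defines the SGLD iterates by an equation (Eq. 3) rather than pseudocode. For orientation, a minimal sketch of one standard SGLD iterate — a minibatch gradient step plus Gaussian noise scaled by the inverse temperature β — is given below. This is an illustrative sketch, not the authors' code; the function names, the quadratic objective in the usage example, and all hyperparameter values are assumptions.

```python
import numpy as np

def sgld_step(w, grad_fn, lr, beta, batch, rng):
    """One generic SGLD iterate (illustrative, not the paper's implementation).

    Implements: w_{t+1} = w_t - lr * g_t + sqrt(2 * lr / beta) * xi,
    where g_t is a minibatch gradient estimate and xi ~ N(0, I).
    """
    grad = grad_fn(w, batch)                     # minibatch gradient estimate
    noise = rng.normal(size=w.shape)             # isotropic Gaussian noise
    return w - lr * grad + np.sqrt(2.0 * lr / beta) * noise
```

As a usage example, running this iterate on the quadratic objective f(w) = ||w||²/2 (gradient g(w) = w) with a large β drives w toward the minimizer at the origin, since the injected noise scale √(2·lr/β) is then small relative to the gradient step.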