On the Importance of Gradient Norm in PAC-Bayesian Bounds
Authors: Itai Gat, Yossi Adi, Alex Schwing, Tamir Hazan
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the proposed bound on Bayesian deep nets and empirically analyze the effect of this new loss-gradient norm term on different neural architectures. In this section, we evaluate our PAC-Bayesian bounds experimentally, both for linear and non-linear models. |
| Researcher Affiliation | Collaboration | Itai Gat (1), Yossi Adi (2,3), Alexander Schwing (4), Tamir Hazan (1); (1) Technion, (2) FAIR Team, Meta AI Research, (3) The Hebrew University of Jerusalem, (4) University of Illinois at Urbana-Champaign |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | We use ten different architectures (ResNet18, PreActResNet18, GoogLeNet, VGG11, VGG13, VGG16, VGG19, DenseNet121, MobileNet, EfficientNetB0) on CIFAR10 and CIFAR100 [Krizhevsky, 2009, Simonyan and Zisserman, 2014, Szegedy et al., 2015, He et al., 2016, Huang et al., 2017, Howard et al., 2017, Tan and Le, 2019]. |
| Dataset Splits | No | The checklist answers 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)?' with [Yes] and notes that 'The complete experimental setup can be found in the Appendix.', but specific dataset split information is not provided in the main paper text. |
| Hardware Specification | No | The checklist answers 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?' with [Yes], but specific hardware details (e.g., GPU/CPU models, memory) are not provided in the main text. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9) needed to replicate the experiment. |
| Experiment Setup | Yes | In all settings we used λ = m and σ_p² = 0.01. (An illustrative sketch follows this table.) |
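
For context on the 'Experiment Setup' row, below is a minimal, hedged sketch of how a PAC-Bayes-style objective with a loss-gradient-norm term and the quoted settings (λ = m, σ_p² = 0.01) could be evaluated in PyTorch. Everything here is an assumption for illustration, including the function names, the choice to take the gradient with respect to the parameters, and the equal weighting of the KL and gradient-norm terms; it is not the authors' bound or released code.

```python
# Minimal sketch (not the authors' released code): a PAC-Bayes-style training
# objective with a loss-gradient-norm term, using the settings quoted above,
# lambda = m (training-set size) and prior variance sigma_p^2 = 0.01.
# Assumptions: a diagonal-Gaussian posterior over weights with a zero-mean
# isotropic Gaussian prior; the gradient norm is taken w.r.t. the parameters
# purely for illustration. All names here are hypothetical placeholders.
import math
import torch
import torch.nn.functional as F


def kl_to_isotropic_prior(mu_q: torch.Tensor, log_var_q: torch.Tensor,
                          var_p: float) -> torch.Tensor:
    """KL( N(mu_q, diag(exp(log_var_q))) || N(0, var_p * I) )."""
    var_q = log_var_q.exp()
    return 0.5 * torch.sum(
        var_q / var_p + mu_q.pow(2) / var_p - 1.0 - log_var_q + math.log(var_p)
    )


def pac_bayes_style_objective(model: torch.nn.Module,
                              x: torch.Tensor, y: torch.Tensor,
                              mu_q: torch.Tensor, log_var_q: torch.Tensor,
                              m: int, sigma_p2: float = 0.01) -> torch.Tensor:
    """Empirical loss + KL / lambda + gradient-norm penalty (illustrative only)."""
    lam = float(m)  # lambda = m, as reported in the paper's setup

    # In a Bayesian net the weights would be sampled from the posterior
    # N(mu_q, diag(exp(log_var_q))) before this forward pass; that step is
    # elided here to keep the sketch short.
    loss = F.cross_entropy(model(x), y)

    # Norm of the loss gradient w.r.t. the trainable parameters; the exact
    # quantity and weighting in the paper's bound may differ.
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))

    kl = kl_to_isotropic_prior(mu_q, log_var_q, sigma_p2)
    return loss + (kl + grad_norm) / lam
```

In practice the posterior sampling of the network weights and the precise form of the gradient-norm term would follow the paper and its appendix; this sketch only shows where the reported hyperparameters λ = m and σ_p² = 0.01 would enter such an objective.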