On the Importance of Gradient Norm in PAC-Bayesian Bounds

Authors: Itai Gat, Yossi Adi, Alex Schwing, Tamir Hazan

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We apply the proposed bound on Bayesian deep nets and empirically analyze the effect of this new loss-gradient norm term on different neural architectures. In this section, we evaluate our PAC-Bayesian bounds experimentally, both for linear and non-linear models."
Researcher Affiliation | Collaboration | Itai Gat (1), Yossi Adi (2,3), Alexander Schwing (4), Tamir Hazan (1); 1 Technion, 2 FAIR Team, Meta AI Research, 3 The Hebrew University of Jerusalem, 4 University of Illinois at Urbana-Champaign
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]"
Open Datasets | Yes | "We use ten different architectures (ResNet18, PreActResNet18, GoogLeNet, VGG11, VGG13, VGG16, VGG19, DenseNet121, MobileNet, EfficientNetB0) on CIFAR10 and CIFAR100 [Krizhevsky, 2009, Simonyan and Zisserman, 2014, Szegedy et al., 2015, He et al., 2016, Huang et al., 2017, Howard et al., 2017, Tan and Le, 2019]."
Dataset Splits | No | The checklist answers "Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes]" and states "The complete experimental setup can be found in the Appendix.", but the specific dataset splits are not given in the main paper text.
Hardware Specification | No | The checklist answers "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes]", but no specific hardware details (e.g., GPU/CPU models, memory) appear in the main text.
Software Dependencies | No | The paper does not list specific software dependencies (e.g., library or solver names with version numbers such as Python 3.8 or PyTorch 1.9) needed to replicate the experiments.
Experiment Setup | Yes | "In all settings we used λ = m and σ_p² = 0.01."