Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Authors: Konstantinos Pitas
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that this approach gives negligible gains when modeling the posterior as a Gaussian with diagonal covariance known as the mean-field approximation. (A minimal sketch of this approximation follows the table.) |
| Researcher Affiliation | Academia | École Polytechnique Fédérale de Lausanne, Switzerland. Correspondence to: Konstantinos Pitas <konstantinos.pitas@epfl.ch>. |
| Pseudocode | No | The paper presents mathematical derivations and descriptions of methods but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about the release of source code or links to a code repository. |
| Open Datasets | Yes | We tested 6 different datasets. These consist of the original MNIST-10 and CIFAR-10 (Krizhevsky & Hinton, 2010) datasets... MNIST (LeCun & Cortes, 2010) |
| Dataset Splits | No | The paper mentions '50000 training samples' but does not explicitly state or describe a validation set or its split. It refers to empirical risk and a testing set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Flipout estimator (Wen et al., 2018)' and the 'Adam optimizer (Kingma & Ba, 2014)', but it does not give version numbers for these or any other software dependencies, which would be needed for a fully reproducible setup. |
| Experiment Setup | Yes | For MNIST we do a grid search over β ∈ [1, 5] and λ ∈ [0.03, 0.1], while for CIFAR we search in β ∈ [1, 5] and λ ∈ [0.1, 0.3]. We used 5 epochs of training using the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 1e-1. (A hedged sketch of this grid search appears below.) |
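
For readers unfamiliar with the term quoted under Research Type: the mean-field approximation models the posterior over network weights as a Gaussian with diagonal covariance, so the KL divergence entering a PAC-Bayes bound factorizes into per-parameter terms. Below is a minimal NumPy sketch of that KL computation; the toy dimensions, scales, and variable names are ours, not the paper's.

```python
import numpy as np

def kl_mean_field_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two diagonal-covariance (mean-field) Gaussians.

    The diagonal structure makes the KL a sum of univariate terms,
    which is what keeps the bound computation tractable for networks
    with millions of weights.
    """
    var_q, var_p = sigma_q ** 2, sigma_p ** 2
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

# Toy usage: posterior centered at "trained" weights w, prior at "init" w0.
rng = np.random.default_rng(0)
w0 = rng.normal(size=1000)             # hypothetical initialization
w = w0 + 0.1 * rng.normal(size=1000)   # hypothetical trained weights
kl = kl_mean_field_gaussians(mu_q=w, sigma_q=np.full_like(w, 0.05),
                             mu_p=w0, sigma_p=np.full_like(w0, 0.1))
print(f"KL(q||p) = {kl:.2f} nats")
```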
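
The Experiment Setup row quotes only the hyperparameter ranges, so the sketch below treats β and λ as generic grid-search knobs. `train_and_bound` is a hypothetical stand-in for the paper's actual procedure (5 epochs of Adam at learning rate 1e-1 followed by bound evaluation), and the grid values shown are just the quoted range endpoints, since the grid resolution is not stated in the table above.

```python
from itertools import product

def train_and_bound(beta, lam):
    """Hypothetical stand-in: train the mean-field posterior for 5 epochs
    with Adam (lr 1e-1) and return the resulting generalization bound.
    A dummy score is returned here so the sketch runs as-is."""
    return beta * lam  # placeholder objective, not the paper's bound

# MNIST ranges from the quoted setup; CIFAR uses lambda in [0.1, 0.3].
grid = list(product([1, 5], [0.03, 0.1]))
best_beta, best_lam = min(grid, key=lambda cfg: train_and_bound(*cfg))
print(f"best beta={best_beta}, lambda={best_lam}")
```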