A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods
Authors: Veit David Wild, Sahra Ghalebikesabi, Dino Sejdinovic, Jeremias Knoblauch
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 experiments. Since the paper's primary focus is on theory, we use two experiments to reinforce some of the predictions it makes in previous sections, and a third experiment that shows why, in direct contradiction to a naive interpretation of the presented theory, it is typically difficult to beat simple DEs. More details about the conducted experiments can be found in Appendix G. The code is available on https://github.com/sghalebikesabi/GVI-WGF. |
| Researcher Affiliation | Academia | Veit D. Wild (University of Oxford); Sahra Ghalebikesabi (University of Oxford); Dino Sejdinovic (University of Adelaide); Jeremias Knoblauch (University College London) |
| Pseudocode | No | The paper describes algorithmic steps in narrative text (e.g., 'Step 1: Sample NE·N particles...', 'Step 2: Evolve the particle...') rather than in a formally structured pseudocode or algorithm block; a hedged sketch of these steps appears after the table. |
| Open Source Code | Yes | The code is available on https://github.com/sghalebikesabi/GVI-WGF. |
| Open Datasets | Yes | Table 1 compares the average (Gaussian) negative log likelihood in the test set for the three ID-GVI methods on some UCI-regression data sets (Lichman, 2013). |
| Dataset Splits | Yes | We split each data set into train (81% of samples), validation (9% of samples), and test set (10% of samples); a sketch of such a split follows the table. |
| Hardware Specification | Yes | While the final experimental results can be run within approximately an hour on a single GeForce RTX 3090 GPU, the complete compute needed for the final results, debugging runs, and sweeps amounts to around 9 days. |
| Software Dependencies | No | The paper mentions common machine learning practices (e.g., neural networks), implying the use of associated software frameworks, but does not specify any software dependencies with version numbers (e.g., 'Python 3.x', 'PyTorch 1.x', 'CUDA x.x'). |
| Experiment Setup | Yes | We train 5 one-hidden-layer neural networks $f_\theta$ with 50 hidden nodes for 40 epochs. ... Initialisation: Kaiming initialisation, i.e. for each layer $l \in \{1, \dots, L\}$ that maps features of dimensionality $n_{l-1}$ into dimensionality $n_l$, we sample $Q_{l,0} \sim \mathcal{N}(0, 2/n_l)$. Reg. parameters: $\lambda_{\mathrm{DLE}} = 10^{-4}$, $\lambda_{\mathrm{DRLE}} = 10^{-4}$, $\tilde{\lambda}_{\mathrm{DRLE}} = 10^{-2}$. Step size: $\eta = 0.1$. Iterations: $K = 10{,}000$. A hedged sketch of this setup follows the table. |
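The 'Pseudocode' row above refers to a two-step particle scheme described only in narrative text (sample particles, then evolve them). Below is a minimal, hedged Python sketch of such a scheme. The Langevin-type update, the function names (`log_posterior_grad`, `evolve_particles`), and the Gaussian target are illustrative assumptions, not the authors' exact algorithm.

```python
# Hypothetical sketch of the two-step particle scheme quoted above:
# Step 1: sample N_E * N particles; Step 2: evolve each particle.
# The discretised Langevin update below is an assumption for illustration.
import numpy as np

def log_posterior_grad(theta):
    """Placeholder gradient of the log target; a standard Gaussian here."""
    return -theta

def evolve_particles(n_ensembles=5, n_particles=10, dim=2,
                     eta=0.1, n_steps=10_000, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: sample N_E * N particles from the initial distribution.
    particles = rng.normal(size=(n_ensembles * n_particles, dim))
    # Step 2: evolve every particle with a Langevin-type update,
    # matching the reported step size eta = 0.1 and K = 10,000 iterations.
    for _ in range(n_steps):
        noise = rng.normal(size=particles.shape)
        particles += eta * log_posterior_grad(particles) + np.sqrt(2 * eta) * noise
    return particles

if __name__ == "__main__":
    final = evolve_particles(n_steps=1_000)
    print(final.mean(axis=0), final.std(axis=0))  # should approach (0, 1)
```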
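The 81/9/10 train/validation/test split quoted in the 'Dataset Splits' row can be reproduced with two chained splits. The sketch below uses scikit-learn, which is an assumption about tooling; the paper does not state how its split was implemented.

```python
# Hedged sketch of an 81% / 9% / 10% train/validation/test split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 8)  # placeholder features
y = np.random.rand(1000)     # placeholder targets

# First split off the 10% test set.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.10, random_state=0)
# Of the remaining 90%, take 10% as validation: 0.9 * 0.1 = 9% overall.
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.10, random_state=0)

assert len(X_train) == 810 and len(X_val) == 90 and len(X_test) == 100
```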
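The 'Experiment Setup' row lists the architecture and hyperparameters. The sketch below instantiates a matching ensemble of 5 one-hidden-layer networks with 50 hidden nodes, Kaiming-style initialisation, and step size 0.1. PyTorch, the ReLU activation, and the input dimension are assumptions; the paper does not name its framework.

```python
# Hedged sketch of the reported setup: 5 one-hidden-layer networks with
# 50 hidden nodes and Kaiming-initialised weights. PyTorch is assumed.
import torch
import torch.nn as nn

def make_network(in_dim, hidden=50, out_dim=1):
    net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))
    for layer in net:
        if isinstance(layer, nn.Linear):
            # Kaiming initialisation: weights ~ N(0, 2/fan), matching the
            # quoted Q_{l,0} ~ N(0, 2/n_l) up to the fan-in/fan-out convention.
            nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
            nn.init.zeros_(layer.bias)
    return net

ensemble = [make_network(in_dim=8) for _ in range(5)]  # 5 ensemble members
optimizers = [torch.optim.SGD(net.parameters(), lr=0.1) for net in ensemble]  # eta = 0.1
```

The per-member optimizers mirror the particle view of the scheme: each ensemble member is one particle evolved independently (plus any regularisation or repulsion terms, which are omitted here).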