Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Bayesian Deep Ensembles via the Neural Tangent Kernel
Authors: Bobby He, Balaji Lakshminarayanan, Yee Whye Teh
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, using finite width NNs we demonstrate that our Bayesian deep ensembles faithfully emulate the analytic posterior predictive when available, and can outperform standard deep ensembles in various out-of-distribution settings, for both regression and classification tasks." (abstract) and the section heading "4 Experiments" |
| Researcher Affiliation | Collaboration | Bobby He, Department of Statistics, University of Oxford; Balaji Lakshminarayanan, Google Research, Brain team; Yee Whye Teh, Department of Statistics, University of Oxford |
| Pseudocode | Yes | "Algorithm 1 NTKGP-param ensemble" |
| Open Source Code | Yes | "Code for this experiment is available at: https://github.com/bobby-he/bayesian-ntk" |
| Open Datasets | Yes | Flight Delays dataset [43]; MNIST vs NotMNIST; CIFAR-10 vs SVHN |
| Dataset Splits | No | "In order to obtain probabilistic predictions, we temperature scale our trained ensemble predictions with cross-entropy loss on a held-out validation set", and the initialisation variance is likewise "tuned using the validation accuracy on a small set of values around the He initialisation". No split percentages or counts are provided for the validation set. (A generic temperature-scaling sketch follows the table.) |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instances) are mentioned for running experiments. |
| Software Dependencies | No | "init(·) will be standard parameterisation initialisation in the JAX library Neural Tangents [38] unless stated otherwise." No version numbers for JAX or Neural Tangents are provided. |
| Experiment Setup | Yes | "For each ensemble method, we use MLP baselearners with two hidden layers of width 512, and erf activation."; "The weight parameter initialisation variance σ²_W is tuned using the validation accuracy on a small set of values around the He initialisation, σ²_W = 2, [44] for all classification experiments."; and "baselearners taking the Myrtle-10 CNN architecture [40] of channel-width 100." (A Neural Tangents sketch of the MLP configuration follows the table.) |
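
The Software Dependencies and Experiment Setup rows pin down the baselearner architecture (two hidden layers of width 512, erf activations, standard parameterisation, σ²_W = 2) and the library (Neural Tangents on JAX). A minimal sketch of that configuration follows; the bias scale `b_std`, the 784-dimensional input, the 10-class output, and the placeholder data are illustrative assumptions rather than details from the paper, whose actual code lives in the repository linked above.

```python
import jax.numpy as jnp
import jax.random as random
import neural_tangents as nt
from neural_tangents import stax

# Two hidden layers of width 512 with erf activations and standard
# parameterisation, with sigma_W^2 = 2 (the He value quoted in the table).
# b_std and the input/output dimensions are illustrative assumptions.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512, W_std=2.0 ** 0.5, b_std=0.05, parameterization='standard'),
    stax.Erf(),
    stax.Dense(512, W_std=2.0 ** 0.5, b_std=0.05, parameterization='standard'),
    stax.Erf(),
    stax.Dense(10, W_std=2.0 ** 0.5, b_std=0.05, parameterization='standard'),
)

# One finite-width baselearner; an ensemble would repeat this with fresh keys.
key = random.PRNGKey(0)
_, params = init_fn(key, input_shape=(-1, 784))

# Placeholder data, for illustration only.
x_train = jnp.zeros((8, 784))
y_train = jnp.zeros((8, 10))
x_test = jnp.zeros((4, 784))

logits = apply_fn(params, x_test)  # finite-width forward pass

# Closed-form posterior predictive of the infinite-width, infinite-ensemble
# limit under MSE training, via the analytic NNGP/NTK kernels.
predict_fn = nt.predict.gradient_descent_mse_ensemble(kernel_fn, x_train, y_train)
mean, cov = predict_fn(x_test=x_test, get='ntk', compute_cov=True)
```

The `kernel_fn` returned by `stax.serial` is what makes the paper's comparison against an analytic posterior predictive possible: it evaluates the NNGP and NTK kernels in closed form, with no finite-width training required.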
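The Dataset Splits row quotes the paper's use of temperature scaling on a held-out validation set but gives no implementation detail. For reference, here is a minimal sketch of standard post-hoc temperature scaling in JAX; the function names and the grid-search range are assumptions for illustration, not taken from the paper.

```python
import jax
import jax.numpy as jnp

def nll(temperature, logits, labels):
    # Mean cross-entropy (negative log-likelihood) of temperature-scaled logits.
    log_probs = jax.nn.log_softmax(logits / temperature)
    return -jnp.mean(log_probs[jnp.arange(labels.shape[0]), labels])

def fit_temperature(val_logits, val_labels, grid=jnp.linspace(0.5, 5.0, 46)):
    # Grid search for the temperature minimising validation NLL; the grid
    # itself is an assumption, not a detail from the paper.
    losses = jnp.stack([nll(t, val_logits, val_labels) for t in grid])
    return grid[jnp.argmin(losses)]
```

At test time, the fitted temperature divides the ensemble logits before the softmax, which calibrates the predicted probabilities without changing the predicted classes.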