Depth Uncertainty in Neural Networks
Authors: Javier Antorán, James Allingham, José Miguel Hernández-Lobato
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach on real-world regression and image classification tasks. Our approach provides uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines. |
| Researcher Affiliation | Collaboration | Javier Antorán (University of Cambridge, ja666@cam.ac.uk); James Urquhart Allingham (University of Cambridge, jua23@cam.ac.uk); José Miguel Hernández-Lobato (University of Cambridge, Microsoft Research, The Alan Turing Institute, jmh233@cam.ac.uk) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/cambridge-mlg/DUN. |
| Open Datasets | Yes | We evaluate all methods on UCI regression datasets using standard (Hernández-Lobato and Adams, 2015) and gap splits (Foong et al., 2019b). We also use the large-scale non-stationary flight delay dataset, preprocessed by Hensman et al. (2013). ... Rotated MNIST Following Snoek et al. (2019), we train all methods on MNIST... Corrupted CIFAR Again following Snoek et al. (2019), we train models on CIFAR10... Following Nalisnick et al. (2019b), we use CIFAR10 and SVHN as in and out of distribution datasets. |
| Dataset Splits | Yes | We evaluate all methods on UCI regression datasets using standard (Hernández-Lobato and Adams, 2015) and gap splits (Foong et al., 2019b). ... Following Deisenroth and Ng (2015), we train on the first 2M data points and test on the subsequent 100k. (A gap-split sketch follows the table.) |
| Hardware Specification | Yes | Batch size is 256, split over 2 Nvidia P100 GPUs. |
| Software Dependencies | No | The paper mentions PyTorch but does not provide version numbers for it or for any other software libraries, which limits reproducibility. |
| Experiment Setup | Yes | We select all hyperparameters, including NN depth, using Bayesian optimisation with HyperBand (Falkner et al., 2018). ... We use default PyTorch training hyperparameters for all methods. We set per-dataset LR schedules. We use 5-element (standard) deep ensembles, as suggested by Snoek et al. (2019), and 10 dropout samples. ... For DUNs, our prior over depth is uniform over the first 13 residual blocks. ... Batch size is 256, split over 2 Nvidia P100 GPUs. |
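
The experiment setup above places a uniform prior over the first 13 residual blocks. As a rough illustration of the depth-marginalisation idea, the sketch below combines the predictions made at every intermediate depth under a learned categorical distribution over depth, initialised at that uniform prior. It is a minimal PyTorch sketch, not the authors' implementation: the class name, layer widths, and the shared output head are assumptions; the actual architecture is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthMarginalisedNet(nn.Module):
    """Hypothetical sketch of depth marginalisation: predictions from every
    intermediate depth are mixed under a categorical distribution over depth.
    Names and sizes are illustrative, not the authors' code."""

    def __init__(self, in_dim=32, width=64, n_blocks=13, n_classes=10):
        super().__init__()
        self.stem = nn.Linear(in_dim, width)
        # One residual block per candidate depth (13 blocks, matching the
        # uniform prior over the first 13 residual blocks quoted above).
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(width, width), nn.ReLU()) for _ in range(n_blocks)]
        )
        self.head = nn.Linear(width, n_classes)  # shared output head (assumption)
        # Variational posterior over depth, initialised to the uniform prior.
        self.depth_logits = nn.Parameter(torch.zeros(n_blocks))

    def forward(self, x):
        h = self.stem(x)
        per_depth_logits = []
        for block in self.blocks:
            h = h + block(h)                       # residual update
            per_depth_logits.append(self.head(h))  # prediction at this depth
        per_depth_logits = torch.stack(per_depth_logits)   # (n_blocks, batch, classes)
        q_depth = F.softmax(self.depth_logits, dim=0)       # q(d)
        # Marginal predictive: sum_d q(d) * p(y | x, d)
        probs = (q_depth[:, None, None] * F.softmax(per_depth_logits, dim=-1)).sum(0)
        return probs
```

A single forward pass yields the depth-marginalised predictive distribution, which is why the paper can report accuracies competitive with more computationally expensive baselines such as deep ensembles or repeated MC-dropout sampling.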
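The UCI "gap" splits cited in the Dataset Splits row come from Foong et al. (2019b). The snippet below is a hypothetical sketch of one such split, assuming the usual construction of sorting the data along a single input dimension and holding out the middle portion as the test set, so that models must produce uncertainty in the gap between training clusters; the function name, signature, and held-out fraction are illustrative rather than taken from the original splits.

```python
import numpy as np

def gap_split(X, y, dim, test_fraction=1/3):
    """Hypothetical gap split: hold out the middle `test_fraction` of the data,
    ordered along input dimension `dim`, as the test set. The exact fraction
    used by Foong et al. (2019b) may differ."""
    order = np.argsort(X[:, dim])
    n = len(order)
    lo = int(n * (1 - test_fraction) / 2)
    hi = n - lo
    test_idx = order[lo:hi]                             # middle of the ordered data
    train_idx = np.concatenate([order[:lo], order[hi:]])  # outer two chunks
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```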