Activation-level uncertainty in deep neural networks

Authors: Pablo Morales-Álvarez, Daniel Hernández-Lobato, Rafael Molina, José Miguel Hernández-Lobato

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, auNN is compared to BNN, fBNN (Sun et al., 2019) and DSVI DGP (Salimbeni & Deisenroth, 2017).
Researcher Affiliation | Academia | Pablo Morales-Álvarez, Department of Computer Science and AI, University of Granada, Spain (pablomorales@decsai.ugr.es); Daniel Hernández-Lobato, Department of Computer Science, Universidad Autónoma de Madrid, Spain; Rafael Molina, Department of Computer Science and AI, University of Granada, Spain; José Miguel Hernández-Lobato, Department of Engineering, University of Cambridge, UK, and Alan Turing Institute, London, UK.
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper.
Open Source Code | Yes | Anonymized code for auNN is provided in the supplementary material, along with a script to run it for the 1D illustrative example of Section 3.1.
Open Datasets | Yes | "UCI REGRESSION DATASETS WITH GAP SPLITS" ... Here we focus on two large scale classification datasets (up to 10^7 instances), and additional metrics that account for uncertainty calibration are reported. We use the well-known particle physics binary classification sets HIGGS (N = 11M, D = 28) and SUSY (N = 5M, D = 18) (Baldi et al., 2014).
Dataset Splits | Yes | Standard splits are not appropriate to evaluate the quality of uncertainty estimates for in-between data, since both train and test sets may cover the space equally. This motivated the introduction of gap splits (Foong et al., 2019). Namely, a set with D dimensions admits D such train-test partitions by considering each dimension, sorting the points according to its value, and selecting the middle 1/3 for test (and the outer 2/3 for training), see Figure 2c; this construction is sketched in the first code block below the table. ... In the case of standard splits, each dataset uses 10 random 90%-10% train-test splits. ... Three random train-test splits are used. In both datasets, 500000 instances are used for test (which leaves 10.5M and 4.5M training instances for HIGGS and SUSY, respectively).
Hardware Specification | Yes | All the experiments were run on an NVIDIA Tesla P100.
Software Dependencies | No | No specific version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow, GPflow versions) are provided.
Experiment Setup | Yes | All the experiments were run on an NVIDIA Tesla P100. For prediction, all the methods use 100 test samples in all the experiments. Details for each section are provided below. ... All the methods use two layers (i.e. one hidden layer). The hidden layer has D = 25 units in all cases. ... The methods are trained for 5000 epochs with the whole dataset (no mini-batches). ... The methods use L = 2, 3 layers. In all cases, the hidden layers have D = 50 units. ... The methods are trained for 10000 epochs, with a mini-batch size that depends on the size of the dataset. For those with fewer than 5000 instances (i.e. Boston, Concrete, Energy, Wine and Yacht), the mini-batch size is 500. For those with more than 5000 (i.e. Naval), the mini-batch size is 5000. ... auNN is trained for 5000 epochs, with a mini-batch size of 5000 (20000 epochs are used for DGP, as proposed by the authors (Salimbeni & Deisenroth, 2017)). ... All the methods are trained for 100 epochs, with a mini-batch size of 5000. ... Throughout the work, we use the Adam optimizer (Kingma & Ba, 2014) with default parameters and a learning rate of 0.001. ... Appendix A: Practical Specifications for auNN [provides initialization details]. A rough sketch of this training configuration is given in the second code block below the table.
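
To make the quoted gap-split procedure concrete, here is a minimal NumPy sketch, assuming dense arrays X (features) and y (targets); the function name gap_splits and all variable names are ours, not taken from the authors' released code:

```python
import numpy as np

def gap_splits(X, y):
    """Yield one (train, test) partition per input dimension: sort the points
    by that dimension, hold out the middle 1/3 for test and keep the outer
    2/3 for training (the gap splits of Foong et al., 2019)."""
    n, d = X.shape
    lo, hi = n // 3, 2 * n // 3                    # boundaries of the middle third
    for dim in range(d):
        order = np.argsort(X[:, dim])              # indices sorted by this feature
        test_idx = order[lo:hi]                    # middle 1/3 -> test
        train_idx = np.concatenate([order[:lo], order[hi:]])  # outer 2/3 -> train
        yield (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])

# Toy usage with synthetic data standing in for a UCI regression set:
# a set with D = 8 dimensions yields 8 train-test partitions.
X, y = np.random.randn(1000, 8), np.random.randn(1000)
for (X_tr, y_tr), (X_te, y_te) in gap_splits(X, y):
    print(X_tr.shape, X_te.shape)
```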
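
The quoted hyperparameters can likewise be summarised in code. The PyTorch snippet below is only a rough sketch of how the reported settings (Adam with default parameters and learning rate 0.001, hidden layers of 50 units, mini-batch size of 500 or 5000 depending on dataset size) map onto a training loop. The model here is a plain deterministic MLP stand-in, not the authors' auNN, and it omits the 100 Monte Carlo test samples used at prediction time by the Bayesian methods.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def make_loader(X, y):
    # Mini-batch size as reported: 500 for datasets with fewer than 5000
    # instances, 5000 otherwise (HIGGS and SUSY also use 5000).
    batch_size = 500 if len(X) < 5000 else 5000
    return DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)

# Plain MLP stand-in (NOT auNN): L = 2 layers with hidden width D = 50.
model = nn.Sequential(nn.Linear(8, 50), nn.ReLU(), nn.Linear(50, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)   # Adam, default params, lr 1e-3
loss_fn = nn.MSELoss()

X, y = torch.randn(1000, 8), torch.randn(1000, 1)             # synthetic stand-in data
loader = make_loader(X, y)

# 10000 epochs are reported for the UCI gap-split runs; reduce for a quick test.
for epoch in range(10000):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
```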