Evidential Deep Learning to Quantify Classification Uncertainty

Authors: Murat Sensoy, Lance Kaplan, Melih Kandemir

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In a set of experiments, we demonstrate that this technique outperforms state-of-the-art BNNs by a large margin on two applications where high-quality uncertainty modeling is of critical importance. Specifically, the predictive distribution of our model approaches the maximum entropy setting much closer than BNNs when fed with an input coming from a distribution different from that of the training samples. Figure 1 illustrates how sensibly our method reacts to the rotation of the input digits. As it is not trained to handle rotational invariance, it sharply reduces classification probabilities and increases the prediction uncertainty after circa 50° input rotation. The standard softmax keeps reporting high confidence for incorrect classes for high rotations. Lastly, we observe that our model is clearly more robust to adversarial attacks on two different benchmark data sets."
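The uncertainty behavior quoted above follows directly from the paper's Dirichlet parameterization: the network outputs non-negative evidence e, the Dirichlet parameters are α = e + 1, and the total uncertainty mass is u = K/S with S = Σ_k α_k. Below is a minimal sketch of that mapping in TensorFlow; the choice of ReLU as the evidence activation and the function name are assumptions of the sketch, not details fixed by the quoted passage.

```python
import tensorflow as tf

def dirichlet_outputs(logits):
    """Map raw network outputs to expected class probabilities and the
    scalar uncertainty mass u = K / S from the paper's formulation.

    ReLU is one valid choice of non-negative evidence activation; treat
    it as an assumption of this sketch.
    """
    evidence = tf.nn.relu(logits)                             # e_k >= 0
    alpha = evidence + 1.0                                    # alpha_k = e_k + 1
    strength = tf.reduce_sum(alpha, axis=-1, keepdims=True)   # S = sum_k alpha_k
    num_classes = tf.cast(tf.shape(logits)[-1], logits.dtype)
    prob = alpha / strength                                   # expected p_k = alpha_k / S
    uncertainty = num_classes / tf.squeeze(strength, -1)      # u = K / S
    return prob, uncertainty
```

For a heavily rotated digit that generates little evidence for any class, S stays near K and u approaches 1, which is exactly the maximum-entropy behavior contrasted with softmax in the quote.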
Researcher Affiliation | Collaboration | Murat Sensoy, Department of Computer Science, Ozyegin University, Turkey (murat.sensoy@ozyegin.edu.tr); Lance Kaplan, US Army Research Lab, Adelphi, MD 20783, USA (lkaplan@ieee.org); Melih Kandemir, Bosch Center for Artificial Intelligence, Robert-Bosch-Campus 1, 71272 Renningen, Germany (melih.kandemir@bosch.com)
Pseudocode | No | The paper describes the methods mathematically and in prose but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The implementation and a demo application of our model is available under https://muratsensoy.github.io/uncertainty.html"
Open Datasets | Yes | "We trained the LeNet architecture for MNIST using..." "We tested these approaches in terms of prediction uncertainty on MNIST and CIFAR10 datasets."
Dataset Splits | Yes | "We trained the LeNet architecture for MNIST using 20 and 50 filters with size 5×5 at the first and second convolutional layers, and 500 hidden units for the fully connected layer. Other methods are also trained using the same architecture with the priors and posteriors described in [24]. The classification performance of each method for the MNIST test set can be seen in Table 1." "We trained the models on the MNIST train split using the same LeNet architecture and test on the notMNIST dataset..." "For training, we use the samples from the first five categories {dog, frog, horse, ship, truck} in the training set of CIFAR10."
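As a concrete reading of the splits quoted above, the sketch below loads the standard MNIST train/test split and keeps only the five CIFAR10 training categories the paper lists. The Keras dataset loaders and the label indices 5 through 9 for {dog, frog, horse, ship, truck} are assumptions based on the standard CIFAR10 label order, not details stated in the paper.

```python
import numpy as np
import tensorflow as tf

# MNIST: the paper trains on the standard train split and evaluates on the
# standard test split (and separately on notMNIST for out-of-distribution tests).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# CIFAR10: train only on {dog, frog, horse, ship, truck}. In the standard
# CIFAR10 label order these are indices 5..9 (an assumption of this sketch).
(cx_train, cy_train), (cx_test, cy_test) = tf.keras.datasets.cifar10.load_data()
known = np.isin(cy_train.ravel(), [5, 6, 7, 8, 9])
cx_known, cy_known = cx_train[known], cy_train[known]
```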
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for experiments.
Software Dependencies | No | "All experiments are implemented in Tensorflow [1] and the Adam [17] optimizer has been used with default settings for training." The paper names the software it uses but gives no version numbers for TensorFlow or other libraries.
Experiment Setup | Yes | "All experiments are implemented in Tensorflow [1] and the Adam [17] optimizer has been used with default settings for training." "We trained the LeNet architecture for MNIST using 20 and 50 filters with size 5×5 at the first and second convolutional layers, and 500 hidden units for the fully connected layer." "We achieve this by incorporating a Kullback-Leibler (KL) divergence term into our loss function that regularizes our predictive distribution by penalizing those divergences from the 'I do not know' state that do not contribute to data fit. The loss with this regularizing term reads $\mathcal{L}(\Theta) = \sum_{i=1}^{N} \mathcal{L}_i(\Theta) + \lambda_t \sum_{i=1}^{N} \mathrm{KL}\big[ D(\mathbf{p}_i \mid \tilde{\boldsymbol{\alpha}}_i) \,\big\|\, D(\mathbf{p}_i \mid \langle 1, \ldots, 1 \rangle) \big]$, where $\lambda_t = \min(1.0, t/10) \in [0, 1]$ is the annealing coefficient and $t$ is the index of the current training epoch."
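The KL term above has a closed form because both arguments are Dirichlet densities, the second being the uniform Dirichlet D(p | ⟨1,...,1⟩). Below is a sketch of that term and the annealing coefficient in TensorFlow, matching the paper's software choice; the function names are illustrative, and alpha here stands for the modified parameters α̃_i with the true-class evidence removed, as in the paper's loss.

```python
import tensorflow as tf

def kl_to_uniform_dirichlet(alpha):
    """Closed-form KL[ D(p | alpha) || D(p | <1,...,1>) ].

    alpha: [batch, K] tensor of (modified) Dirichlet parameters, alpha_k >= 1.
    Standard Dirichlet-to-Dirichlet KL, where the uniform Dirichlet's
    log-normalizer reduces to -lgamma(K).
    """
    K = tf.cast(tf.shape(alpha)[-1], alpha.dtype)
    S = tf.reduce_sum(alpha, axis=-1, keepdims=True)   # Dirichlet strength
    log_norm = (tf.math.lgamma(tf.squeeze(S, -1))
                - tf.math.lgamma(K)
                - tf.reduce_sum(tf.math.lgamma(alpha), axis=-1))
    digamma_term = tf.reduce_sum(
        (alpha - 1.0) * (tf.math.digamma(alpha) - tf.math.digamma(S)),
        axis=-1)
    return log_norm + digamma_term

def annealing_coefficient(epoch):
    """lambda_t = min(1.0, t / 10), with t the current epoch index."""
    return tf.minimum(1.0, tf.cast(epoch, tf.float32) / 10.0)
```

Ramping λ_t from 0 to 1 over the first ten epochs keeps the regularizer from dominating early training, so the network first learns to fit the data and only later is pushed toward the "I do not know" state on evidence that does not contribute to data fit.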