Evidential Deep Learning to Quantify Classification Uncertainty

Authors: Murat Sensoy, Lance Kaplan, Melih Kandemir

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In a set of experiments, we demonstrate that this technique outperforms state-of-the-art BNNs by a large margin on two applications where high-quality uncertainty modeling is of critical importance. Specifically, the predictive distribution of our model approaches the maximum entropy setting much closer than BNNs when fed with an input coming from a distribution different from that of the training samples. Figure 1 illustrates how sensibly our method reacts to the rotation of the input digits. As it is not trained to handle rotational invariance, it sharply reduces classification probabilities and increases the prediction uncertainty after circa 50° input rotation. The standard softmax keeps reporting high confidence for incorrect classes for high rotations. Lastly, we observe that our model is clearly more robust to adversarial attacks on two different benchmark data sets."
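The uncertainty behavior quoted above follows directly from the paper's Dirichlet parameterization: the network outputs non-negative evidence e, the Dirichlet parameters are α = e + 1, and the total uncertainty mass is u = K/S with S = Σ_k α_k. Below is a minimal sketch of that mapping in TensorFlow; the choice of ReLU as the evidence activation and the function name are assumptions of the sketch, not details fixed by the quoted passage.

```python
import tensorflow as tf

def dirichlet_outputs(logits):
    """Map raw network outputs to expected class probabilities and the
    scalar uncertainty mass u = K / S from the paper's formulation.

    ReLU is one valid choice of non-negative evidence activation; treat
    it as an assumption of this sketch.
    """
    evidence = tf.nn.relu(logits)                             # e_k >= 0
    alpha = evidence + 1.0                                    # alpha_k = e_k + 1
    strength = tf.reduce_sum(alpha, axis=-1, keepdims=True)   # S = sum_k alpha_k
    num_classes = tf.cast(tf.shape(logits)[-1], logits.dtype)
    prob = alpha / strength                                   # expected p_k = alpha_k / S
    uncertainty = num_classes / tf.squeeze(strength, -1)      # u = K / S
    return prob, uncertainty
```

For a heavily rotated digit that generates little evidence for any class, S stays near K and u approaches 1, which is exactly the maximum-entropy behavior contrasted with softmax in the quote.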
Researcher Affiliation | Collaboration | Murat Sensoy, Department of Computer Science, Ozyegin University, Turkey (murat.sensoy@ozyegin.edu.tr); Lance Kaplan, US Army Research Lab, Adelphi, MD 20783, USA (lkaplan@ieee.org); Melih Kandemir, Bosch Center for Artificial Intelligence, Robert-Bosch-Campus 1, 71272 Renningen, Germany (melih.kandemir@bosch.com)
Pseudocode | No | The paper describes the methods mathematically and in prose but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The implementation and a demo application of our model is available under https://muratsensoy.github.io/uncertainty.html"
Open Datasets | Yes | "We trained the LeNet architecture for MNIST using..." "We tested these approaches in terms of prediction uncertainty on MNIST and CIFAR10 datasets."
Dataset Splits | Yes | "We trained the LeNet architecture for MNIST using 20 and 50 filters with size 5×5 at the first and second convolutional layers, and 500 hidden units for the fully connected layer. Other methods are also trained using the same architecture with the priors and posteriors described in [24]. The classification performance of each method for the MNIST test set can be seen in Table 1." "We trained the models on the MNIST train split using the same LeNet architecture and test on the notMNIST dataset..." "For training, we use the samples from the first five categories {dog, frog, horse, ship, truck} in the training set of CIFAR10."
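As a concrete reading of the splits quoted above, the sketch below loads the standard MNIST train/test split and keeps only the five CIFAR10 training categories the paper lists. The Keras dataset loaders and the label indices 5 through 9 for {dog, frog, horse, ship, truck} are assumptions based on the standard CIFAR10 label order, not details stated in the paper.

```python
import numpy as np
import tensorflow as tf

# MNIST: the paper trains on the standard train split and evaluates on the
# standard test split (and separately on notMNIST for out-of-distribution tests).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# CIFAR10: train only on {dog, frog, horse, ship, truck}. In the standard
# CIFAR10 label order these are indices 5..9 (an assumption of this sketch).
(cx_train, cy_train), (cx_test, cy_test) = tf.keras.datasets.cifar10.load_data()
known = np.isin(cy_train.ravel(), [5, 6, 7, 8, 9])
cx_known, cy_known = cx_train[known], cy_train[known]
```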
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for experiments.
Software Dependencies | No | "All experiments are implemented in Tensorflow [1] and the Adam [17] optimizer has been used with default settings for training." The paper names the software it uses but gives no version numbers for TensorFlow or other libraries.
Experiment Setup | Yes | "All experiments are implemented in Tensorflow [1] and the Adam [17] optimizer has been used with default settings for training." "We trained the LeNet architecture for MNIST using 20 and 50 filters with size 5×5 at the first and second convolutional layers, and 500 hidden units for the fully connected layer." "We achieve this by incorporating a Kullback-Leibler (KL) divergence term into our loss function that regularizes our predictive distribution by penalizing those divergences from the 'I do not know' state that do not contribute to data fit. The loss with this regularizing term reads $\mathcal{L}(\Theta) = \sum_{i=1}^{N} \mathcal{L}_i(\Theta) + \lambda_t \sum_{i=1}^{N} \mathrm{KL}\big[ D(\mathbf{p}_i \mid \tilde{\boldsymbol{\alpha}}_i) \,\big\|\, D(\mathbf{p}_i \mid \langle 1, \ldots, 1 \rangle) \big]$, where $\lambda_t = \min(1.0, t/10) \in [0, 1]$ is the annealing coefficient and $t$ is the index of the current training epoch."
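The KL term above has a closed form because both arguments are Dirichlet densities, the second being the uniform Dirichlet D(p | ⟨1,...,1⟩). Below is a sketch of that term and the annealing coefficient in TensorFlow, matching the paper's software choice; the function names are illustrative, and alpha here stands for the modified parameters α̃_i with the true-class evidence removed, as in the paper's loss.

```python
import tensorflow as tf

def kl_to_uniform_dirichlet(alpha):
    """Closed-form KL[ D(p | alpha) || D(p | <1,...,1>) ].

    alpha: [batch, K] tensor of (modified) Dirichlet parameters, alpha_k >= 1.
    Standard Dirichlet-to-Dirichlet KL, where the uniform Dirichlet's
    log-normalizer reduces to -lgamma(K).
    """
    K = tf.cast(tf.shape(alpha)[-1], alpha.dtype)
    S = tf.reduce_sum(alpha, axis=-1, keepdims=True)   # Dirichlet strength
    log_norm = (tf.math.lgamma(tf.squeeze(S, -1))
                - tf.math.lgamma(K)
                - tf.reduce_sum(tf.math.lgamma(alpha), axis=-1))
    digamma_term = tf.reduce_sum(
        (alpha - 1.0) * (tf.math.digamma(alpha) - tf.math.digamma(S)),
        axis=-1)
    return log_norm + digamma_term

def annealing_coefficient(epoch):
    """lambda_t = min(1.0, t / 10), with t the current epoch index."""
    return tf.minimum(1.0, tf.cast(epoch, tf.float32) / 10.0)
```

Ramping λ_t from 0 to 1 over the first ten epochs keeps the regularizer from dominating early training, so the network first learns to fit the data and only later is pushed toward the "I do not know" state on evidence that does not contribute to data fit.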