Multidimensional Uncertainty-Aware Evidential Neural Networks

Authors: Yibo Hu, Yuzhe Ou, Xujiang Zhao, Jin-Hee Cho, Feng Chen (pp. 7815-7822)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Via extensive empirical experiments based on both synthetic and real-world datasets, we demonstrated that the estimation of uncertainty by WENN can significantly help distinguish OOD samples from boundary samples. WENN outperformed in OOD detection when compared with other competitive counterparts.
Researcher Affiliation | Academia | 1 Department of Computer Science, The University of Texas at Dallas, Richardson, TX, USA; 2 Department of Computer Science, Virginia Tech, Falls Church, VA, USA. {yibo.hu, yuzhe.ou, xujiang.zhao}@utdallas.edu, jicho@vt.edu, feng.chen@utdallas.edu
Pseudocode | Yes | Algorithm 1: Alternating minimization for WGAN and ENN
Open Source Code | Yes | "For more details refer to Appendix and our source code": https://github.com/snowood1/wenn
Open Datasets | Yes | We followed the same experiments in (Sensoy et al. 2020) on MNIST (LeCun et al. 1998) and CIFAR10 (Krizhevsky 2012): (1) For the MNIST dataset, we used the same LeNet-5 architecture from (Sensoy et al. 2020). We trained the model on the MNIST training set, tested on the MNIST testing set as ID samples, and used notMNIST (Bulatov 2011) as OOD samples; and (2) For the CIFAR10 dataset, we used ResNet-20 (He et al. 2016) as the classifier in all the models considered in this work. We trained on the samples from the first five categories {airplane, automobile, bird, cat, deer} of the CIFAR10 training set (i.e., ID), while using the other five categories {ship, truck, dog, frog, horse} as testing OODs. ... We used Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), notMNIST, CIFAR10, CIFAR100 (Krizhevsky 2012), SVHN (Netzer et al. 2011), the classroom class of LSUN (i.e., LSUN-CR) (Yu et al. 2015), and uniform noise as ID or OOD datasets.
Dataset Splits | No | The paper describes training and testing splits for datasets like MNIST and CIFAR10 (e.g., "trained the model on MNIST training set and tested on MNIST testing set"), but it does not explicitly define or refer to a separate validation split with specific percentages or counts.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions deep learning architectures like LeNet-5 and ResNet-20, but it does not specify any software dependencies (e.g., libraries, frameworks) with version numbers that would be needed for replication.
Experiment Setup | Yes | For WENN, we set β = 0.1, n_d = 2, n_e = 1, m = 256, and learning rate = 1e-4 in Algorithm 1 for all experiments; these values were fine-tuned considering both OOD detection performance and ID classification accuracy.
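The reported settings (n_d = 2, n_e = 1, m = 256) describe the alternating schedule of Algorithm 1: per minibatch, n_d discriminator (WGAN critic) steps followed by n_e ENN steps. A minimal Python sketch of that schedule, with the actual WGAN and ENN loss updates stubbed out as hypothetical placeholders (the function and step names below are illustrative, not from the paper's code):

```python
def alternating_schedule(num_batches, n_d=2, n_e=1):
    """Return the update order implied by Algorithm 1's alternating
    minimization: for each minibatch, take n_d critic steps, then
    n_e ENN steps. Only the scheduling structure is shown; real
    gradient updates would replace the string placeholders."""
    steps = []
    for _ in range(num_batches):
        for _ in range(n_d):
            steps.append("critic")  # placeholder for a WGAN critic update
        for _ in range(n_e):
            steps.append("enn")     # placeholder for an ENN/generator update
    return steps

# With the reported n_d = 2, n_e = 1, each minibatch yields two critic
# updates followed by one ENN update.
schedule = alternating_schedule(num_batches=3)
```

This only illustrates why the ratio n_d : n_e matters for reproduction: a run with the same losses but a different step ratio would not match the reported training procedure.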