Multidimensional Uncertainty-Aware Evidential Neural Networks

Authors: Yibo Hu, Yuzhe Ou, Xujiang Zhao, Jin-Hee Cho, Feng Chen (pp. 7815-7822)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Via extensive empirical experiments based on both synthetic and real-world datasets, we demonstrated that the estimation of uncertainty by WENN can significantly help distinguish OOD samples from boundary samples. WENN outperformed in OOD detection when compared with other competitive counterparts.
Researcher Affiliation | Academia | 1 Department of Computer Science, The University of Texas at Dallas, Richardson, TX, USA; 2 Department of Computer Science, Virginia Tech, Falls Church, VA, USA. {yibo.hu, yuzhe.ou, xujiang.zhao}@utdallas.edu, jicho@vt.edu, feng.chen@utdallas.edu
Pseudocode | Yes | Algorithm 1: Alternating minimization for WGAN and ENN
Open Source Code | Yes | "For more details refer to Appendix and our source code": https://github.com/snowood1/wenn
Open Datasets | Yes | We followed the same experiments in (Sensoy et al. 2020) on MNIST (LeCun et al. 1998) and CIFAR10 (Krizhevsky 2012): (1) For the MNIST dataset, we used the same LeNet-5 architecture from (Sensoy et al. 2020). We trained the model on the MNIST training set, tested on the MNIST testing set as ID samples, and used notMNIST (Bulatov 2011) as OOD samples; and (2) For the CIFAR10 dataset, we used ResNet-20 (He et al. 2016) as the classifier in all the models considered in this work. We trained on the samples from the first five categories {airplane, automobile, bird, cat, deer} of the CIFAR10 training set (i.e., ID), while using the other five categories {ship, truck, dog, frog, horse} as testing OODs. ... We used Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), notMNIST, CIFAR10, CIFAR100 (Krizhevsky 2012), SVHN (Netzer et al. 2011), the classroom class of LSUN (i.e., LSUN-CR) (Yu et al. 2015), and uniform noise as ID or OOD datasets.
Dataset Splits | No | The paper describes training and testing splits for datasets like MNIST and CIFAR10 (e.g., "trained the model on MNIST training set and tested on MNIST testing set"), but it does not explicitly define or refer to a separate validation split with specific percentages or counts.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions deep learning architectures like LeNet-5 and ResNet-20, but it does not specify any software dependencies (e.g., libraries, frameworks) with version numbers that would be needed for replication.
Experiment Setup | Yes | For WENN, we set β = 0.1, n_d = 2, n_e = 1, m = 256, and learning rate = 1e-4 in Algorithm 1 for all experiments; these values were fine-tuned considering both OOD detection performance and ID classification accuracy.
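The reported settings (n_d = 2, n_e = 1, m = 256) describe the alternating schedule of Algorithm 1: per minibatch, n_d discriminator (WGAN critic) steps followed by n_e ENN steps. A minimal Python sketch of that schedule, with the actual WGAN and ENN loss updates stubbed out as hypothetical placeholders (the function and step names below are illustrative, not from the paper's code):

```python
def alternating_schedule(num_batches, n_d=2, n_e=1):
    """Return the update order implied by Algorithm 1's alternating
    minimization: for each minibatch, take n_d critic steps, then
    n_e ENN steps. Only the scheduling structure is shown; real
    gradient updates would replace the string placeholders."""
    steps = []
    for _ in range(num_batches):
        for _ in range(n_d):
            steps.append("critic")  # placeholder for a WGAN critic update
        for _ in range(n_e):
            steps.append("enn")     # placeholder for an ENN/generator update
    return steps

# With the reported n_d = 2, n_e = 1, each minibatch yields two critic
# updates followed by one ENN update.
schedule = alternating_schedule(num_batches=3)
```

This only illustrates why the ratio n_d : n_e matters for reproduction: a run with the same losses but a different step ratio would not match the reported training procedure.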