Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?
Authors: Maohao Shen, Jongha (Jon) Ryu, Soumya Ghosh, Yuheng Bu, Prasanna Sattigeri, Subhro Das, Gregory Wornell
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through all these analyses, we conclude that even when EDL methods are empirically effective on downstream tasks, this occurs despite their poor uncertainty quantification capabilities. Our investigation suggests that incorporating model uncertainty can help EDL methods faithfully quantify uncertainties and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity. |
| Researcher Affiliation | Collaboration | (1) Department of EECS, MIT, Cambridge, MA 02139; (2) MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA 02142; (3) Department of ECE, University of Florida, Gainesville, FL 32611 |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | The code to replicate the experiments is available on https://github.com/maohaos2/EDL-Mirage. |
| Open Datasets | Yes | We consider two ID datasets: CIFAR10, and CIFAR100. For the OOD detection task, we select four OOD datasets for each ID dataset: we use SVHN, FMNIST, Tiny Image Net, and corrupted ID data. ... Fashion-MNIST ... SVHN ... Tiny Image Net (TIM) |
| Dataset Splits | Yes (see the data-loading sketch after this table) | For in-distribution datasets CIFAR10 and CIFAR100, we divide the original training data into two subsets: a training set and a validation set, using an 80%/20% split ratio. |
| Hardware Specification | Yes | All experiments are implemented in PyTorch using a Tesla V100 GPU with 32 GB memory. |
| Software Dependencies | No | The paper states 'All experiments are implemented in PyTorch' but does not specify a version number for PyTorch or other software dependencies. |
| Experiment Setup | Yes (see the training-configuration sketch after this table) | The maximum training epochs are set to 50, 100, and 200 for 2-D Gaussian data, CIFAR10 and CIFAR100, respectively. ... The training batch size is set to 64, 64, and 256 for Gaussian data, CIFAR10 and CIFAR100, respectively. We use Adam optimizer without weight decay or learning rate schedule during model optimization. The learning rates of the optimizer are 1e-3, 2.5e-4, 2.5e-4 for Gaussian data, CIFAR10 and CIFAR100, respectively. The default hyper-parameter λ is set to 1e-4 for those EDL methods with regularizer. |
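
The data-loading sketch below illustrates the "Open Datasets" and "Dataset Splits" rows above. It is not the authors' released code (that is linked in the table); it only shows one plausible way to load CIFAR10 as the in-distribution set with torchvision, split the official training set 80%/20% into train and validation subsets as the paper describes, and load SVHN as one of the OOD test sets. The normalization statistics, random seed, and data root are assumptions.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Standard CIFAR10 normalization statistics; the exact preprocessing used in the
# paper is not specified in this summary, so these values are an assumption.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

# In-distribution data: CIFAR10 (the paper also uses CIFAR100 the same way).
full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# 80%/20% train/validation split of the original training data, per the paper.
n_train = int(0.8 * len(full_train))
n_val = len(full_train) - n_train
train_set, val_set = random_split(
    full_train,
    [n_train, n_val],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)

# One of the four OOD test sets named in the paper (SVHN is already 32x32 RGB).
ood_svhn = datasets.SVHN(root="./data", split="test", download=True, transform=transform)

# Batch size 64 for CIFAR10, per the "Experiment Setup" row.
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
val_loader = DataLoader(val_set, batch_size=64, shuffle=False)
```

The same pattern would apply to CIFAR100 (with batch size 256) and to the remaining OOD sets; Fashion-MNIST and Tiny Image Net would additionally need resizing and channel conversion to match the 32×32 RGB input format.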
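The "Experiment Setup" row maps onto a small training configuration. The following sketch wires those hyper-parameters together for the CIFAR10 case: Adam with learning rate 2.5e-4, no weight decay or learning-rate schedule, 100 epochs, and a regularization weight λ = 1e-4. The backbone model and the loss are placeholders standing in for whichever architecture and evidential objective a given EDL method uses; they are not taken from the paper.

```python
import torch
import torch.nn as nn

# Hyper-parameters from the "Experiment Setup" row (CIFAR10 values of the
# reported triples); the other values apply to the Gaussian and CIFAR100 settings.
MAX_EPOCHS = 100        # 50 / 100 / 200 for Gaussian data, CIFAR10, CIFAR100
LEARNING_RATE = 2.5e-4  # 1e-3 / 2.5e-4 / 2.5e-4
LAMBDA_REG = 1e-4       # default regularizer weight for EDL methods with a regularizer

# Placeholder backbone: any network producing per-class outputs would go here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Adam without weight decay or a learning-rate schedule, per the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE, weight_decay=0.0)

def edl_style_loss(logits, targets):
    """Placeholder objective: task loss plus a lambda-weighted regularizer.

    This is NOT any specific EDL loss from the paper; it only illustrates where
    the reported lambda = 1e-4 would enter such an objective.
    """
    task_loss = nn.functional.cross_entropy(logits, targets)
    regularizer = logits.pow(2).mean()  # stand-in for an EDL regularization term
    return task_loss + LAMBDA_REG * regularizer

def train(train_loader):
    model.train()
    for epoch in range(MAX_EPOCHS):
        for images, targets in train_loader:
            images, targets = images.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = edl_style_loss(model(images), targets)
            loss.backward()
            optimizer.step()
```

Calling `train(train_loader)` with the loader from the previous sketch reproduces the reported schedule shape (epochs, batch size, optimizer settings), though not any particular EDL method from the paper.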