Uncertainty Estimation by Density Aware Evidential Deep Learning

Authors: Taeseong Yoon, Heeyoung Kim

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | DAEDL demonstrates state-of-the-art performance across diverse downstream tasks related to uncertainty estimation and classification. We conducted extensive experiments across various downstream tasks related to uncertainty estimation and classification.
Researcher Affiliation | Academia | Department of Industrial and Systems Engineering, KAIST, Daejeon, Republic of Korea. Correspondence to: Heeyoung Kim <heeyoungkim@kaist.ac.kr>.
Pseudocode | Yes | The training procedure for DAEDL is presented in Algorithm 1 in Appendix B. The density estimation algorithm for DAEDL is provided in Algorithm 2 in Appendix B. The algorithm for the prediction of DAEDL is provided in Algorithm 3 in Appendix B. (An illustrative evidential-prediction sketch follows the table.)
Open Source Code | Yes | The code for our model is available at https://github.com/TaeseongYoon/DAEDL.
Open Datasets | Yes | To evaluate the OOD detection performance, we used MNIST (LeCun, 1998) and CIFAR-10 (Krizhevsky et al., 2009) as ID datasets. These are well-known public datasets with clear citations.
Dataset Splits | Yes | We partitioned the training samples into a training set and a validation set with a ratio of 0.8 : 0.2 [for MNIST]. We partitioned the training samples into a training set and a validation set with a ratio of 0.95 : 0.05 [for CIFAR-10]. To prevent overfitting, early stopping based on the validation loss was implemented for both datasets. (A split/early-stopping sketch follows the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments. It mentions 'We implemented a configuration of 3 convolutional layers and 3 dense layers (ConvNet)' and 'we used VGG-16' but gives no specific GPU, CPU, or cloud computing instance details.
Software Dependencies | No | The paper mentions 'The Adam optimizer and Lambda LR scheduler were employed for both datasets.' but does not provide specific version numbers for these software components or any other libraries used.
Experiment Setup | Yes | In the case of MNIST, training extended up to 50 epochs with a batch size of 64, and for CIFAR-10, we trained up to 100 epochs with the same batch size. The learning rate (η) and regularization parameter (λ) were determined through a grid search, yielding the optimal values of (10^-3, 5 × 10^-2). For VGG-16, dropout with a rate of 0.5 was applied. Table 7: B, p_drop, η, lr_λ, λ, and T_max denote the batch size, dropout rate, learning rate, scheduler parameter, regularization parameter, and maximum epoch, respectively. (A training-setup sketch follows the table.)
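
The Pseudocode row above points to Algorithms 1-3 in the paper's Appendix B, which are not reproduced in this excerpt. For orientation only, the sketch below shows a generic evidential prediction step (non-negative evidence, Dirichlet concentration parameters, total-evidence uncertainty) as commonly used in evidential deep learning; it is not DAEDL's density-aware Algorithm 3, and the softplus evidence mapping and the name `evidential_predict` are assumptions.

```python
import torch
import torch.nn.functional as F

def evidential_predict(logits: torch.Tensor):
    """Generic evidential prediction step (illustrative only; not DAEDL's Algorithm 3)."""
    evidence = F.softplus(logits)              # non-negative evidence per class (assumed mapping)
    alpha = evidence + 1.0                     # Dirichlet concentration parameters
    strength = alpha.sum(dim=-1, keepdim=True) # total evidence (Dirichlet strength)
    probs = alpha / strength                   # expected categorical probabilities
    num_classes = logits.shape[-1]
    uncertainty = (num_classes / strength).squeeze(-1)  # vacuity-style uncertainty in (0, 1]
    return probs, uncertainty

# Example: a batch of 4 random 10-class logit vectors.
probs, uncertainty = evidential_predict(torch.randn(4, 10))
```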
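
The Dataset Splits row reports a 0.8 : 0.2 training/validation split for MNIST, a 0.95 : 0.05 split for CIFAR-10, and early stopping on the validation loss. The following is a minimal sketch of how such a split and stopping criterion could be set up with PyTorch/torchvision; the preprocessing transform and the patience value are assumptions not stated in the excerpt.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Reported splits: 0.8 : 0.2 for MNIST, 0.95 : 0.05 for CIFAR-10.
transform = transforms.ToTensor()  # the paper's exact preprocessing is not given (assumption)
full_train = datasets.MNIST(root="data", train=True, download=True, transform=transform)

n_total = len(full_train)            # 60,000 training images for MNIST
n_train = int(0.8 * n_total)         # use 0.95 for CIFAR-10
train_set, val_set = random_split(full_train, [n_train, n_total - n_train])

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
val_loader = DataLoader(val_set, batch_size=64)

# Early stopping is keyed on the validation loss; the patience value is an assumption.
best_val_loss, patience, epochs_without_improvement = float("inf"), 5, 0
```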
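
The Experiment Setup row reports the Adam optimizer, a LambdaLR scheduler, a batch size of 64, a learning rate of 10^-3, a regularization parameter of 5 × 10^-2, and up to 50/100 epochs for MNIST/CIFAR-10. The sketch below simply wires these reported values together; the placeholder model and the decay function passed to LambdaLR are assumptions, since the excerpt gives neither the scheduler parameter lr_λ nor the loss definition.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import LambdaLR

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # placeholder; the paper uses a ConvNet / VGG-16
optimizer = optim.Adam(model.parameters(), lr=1e-3)           # reported learning rate 10^-3
scheduler = LambdaLR(optimizer, lr_lambda=lambda epoch: 0.95 ** epoch)  # decay factor is an assumption
reg_lambda = 5e-2                                             # reported regularization parameter λ
max_epochs = 50                                               # 100 for CIFAR-10

for epoch in range(max_epochs):
    # A training pass over train_loader would go here; the total loss would combine
    # the task loss with reg_lambda times the regularizer, as the quoted setup suggests.
    scheduler.step()
```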