One Step Closer to Unbiased Aleatoric Uncertainty Estimation

Authors: Wang Zhang, Ziwen Martin Ma, Subhro Das, Tsui-Wei Lily Weng, Alexandre Megretski, Luca Daniel, Lam M. Nguyen

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method." From Section 4 (Experiments): "To showcase and validate the effectiveness of the DVA technique, we start with a simple regression example."
Researcher Affiliation | Collaboration | 1. Massachusetts Institute of Technology, Cambridge, MA, USA; 2. Harvard University, Cambridge, MA, USA; 3. MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA, USA; 4. University of California San Diego, San Diego, CA, USA; 5. IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, USA
Pseudocode | Yes | "The key steps of the algorithm are listed below, with the detailed algorithm being deferred to Appendix B." A hedged sketch of the variance-attenuation objective that DVA builds on is given after this table.
Open Source Code | Yes | "Source code available at https://github.com/wz16/DVA. Please refer to arXiv for full technical appendix."
Open Datasets | Yes | "The NYU Depth v2 dataset (Silberman et al. 2012) contains 27k RGB-depth image pairs." "... utilizing the APPA-REAL database (Agustsson et al. 2017) (Figure 4 shows two samples) consisting of 7591 images."
Dataset Splits | No | The paper mentions training data for the toy example ("A total of 1000 samples are drawn from x ∈ [1, 9] for training") but does not provide train/validation/test split percentages, sample counts, or a detailed splitting methodology for all experiments, in particular for the real-world NYU Depth v2 and APPA-REAL datasets.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper does not list specific software dependencies or library names with version numbers needed to replicate the experiments.
Experiment Setup | Yes | "We employ two stochastic prediction models: an ensemble model with 5 base learners and a Bayesian neural network (BNN; Hernández-Lobato and Adams 2015) with a sampling size of 5." "For a fair comparison between the VA and DVA, the prediction model is pre-trained with MSE loss ahead of the variance training." "We employ a ResNeXt-50 (32×4d) pretrained on ImageNet, replace the last layer and fine-tune it on the APPA-REAL database with averaged apparent age as labels. To estimate uncertainty, we append an additional linear layer to the second-to-last layer of the network, which maps its outputs (with a dimension of 2048) to a logarithmic uncertainty measure. We then fine-tune only this linear layer." A PyTorch sketch of this uncertainty head is given after the table.
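
On the Pseudocode row above: the DVA key steps themselves are deferred to Appendix B of the paper and the linked repository, so they are not reproduced here. As context, the sketch below shows only the standard variance-attenuation (VA) loss that DVA is compared against, assuming a heteroscedastic Gaussian likelihood and a network that predicts a log-variance (matching the "logarithmic uncertainty measure" in the Experiment Setup row); the function name va_loss is illustrative, not taken from the authors' code.

```python
import torch

def va_loss(mu: torch.Tensor, log_var: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Variance-attenuation loss: heteroscedastic Gaussian NLL.

    mu:      predicted means, shape (N,)
    log_var: predicted log-variances, shape (N,); predicting the log
             keeps the implied variance strictly positive
    y:       observed regression targets, shape (N,)
    """
    # Per-sample Gaussian negative log-likelihood, up to an additive constant:
    #   0.5 * log(sigma^2) + (y - mu)^2 / (2 * sigma^2)
    return (0.5 * log_var + 0.5 * (y - mu) ** 2 / log_var.exp()).mean()
```

Consistent with the setup row, a two-stage use of this loss would first pre-train the mean predictor with MSE and then optimize only the variance parameters.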
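
On the Experiment Setup row: below is a minimal PyTorch sketch of the described age-uncertainty architecture. The torchvision constructor and the ImageNet weight identifier are assumptions for illustration (the paper does not name its framework); this is not the authors' implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnext50_32x4d

# ResNeXt-50 (32x4d) pretrained on ImageNet; the final classifier is
# replaced by a single regression output (averaged apparent age), which
# the paper fine-tunes on APPA-REAL before the uncertainty stage.
backbone = resnext50_32x4d(weights="IMAGENET1K_V1")
feat_dim = backbone.fc.in_features          # 2048, as stated in the paper
backbone.fc = nn.Linear(feat_dim, 1)

# Everything up to and including global average pooling yields the
# penultimate 2048-d features that the uncertainty head attaches to.
features = nn.Sequential(*list(backbone.children())[:-1])

# Additional linear layer mapping those features to a logarithmic
# uncertainty measure; in the second stage only this head is trained.
log_var_head = nn.Linear(feat_dim, 1)
for p in backbone.parameters():             # freeze the fine-tuned backbone
    p.requires_grad = False

x = torch.randn(2, 3, 224, 224)             # dummy image batch
h = features(x).flatten(1)                  # (2, 2048) penultimate features
mu = backbone.fc(h)                         # predicted mean age
log_var = log_var_head(h)                   # predicted log-uncertainty
```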