One Step Closer to Unbiased Aleatoric Uncertainty Estimation
Authors: Wang Zhang, Ziwen Martin Ma, Subhro Das, Tsui-Wei Lily Weng, Alexandre Megretski, Luca Daniel, Lam M. Nguyen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method. ... To showcase and validate the effectiveness of the DVA technique, we start with a simple regression example. |
| Researcher Affiliation | Collaboration | 1 Massachusetts Institute of Technology, Cambridge, MA, USA 2 Harvard University, Cambridge, MA, USA 3 MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA, USA 4 University of California San Diego, San Diego, CA, USA 5 IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, USA |
| Pseudocode | Yes | The key steps of the algorithm are listed below, with the detailed algorithm being deferred to Appendix B. |
| Open Source Code | Yes | Source code available at https://github.com/wz16/DVA. Please refer to arXiv for full technical appendix. |
| Open Datasets | Yes | The NYU Depth v2 dataset (Silberman et al. 2012) contains 27k RGB-depth image pairs. ... utilizing the APPA-REAL database (Agustsson et al. 2017) (Figure 4 shows two samples) consisting of 7591 images. |
| Dataset Splits | No | The paper mentions training data for the toy example (“A total of 1000 samples are drawn from x ∈ [1, 9] for training”) but does not provide specific train/validation/test split percentages, sample counts, or detailed splitting methodology for all experiments, especially for the real-world datasets like NYU Depth v2 or APPA-REAL. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or library names with version numbers needed to replicate the experiment. |
| Experiment Setup | Yes | We employ two stochastic prediction models: an ensemble model with 5 base learners and a Bayesian neural network (BNN, Hernández-Lobato and Adams 2015) with a sampling size of 5. For a fair comparison between the VA and DVA, the prediction model is pre-trained with MSE loss ahead of the variance training. We employ a ResNeXt-50 (32×4d) pretrained on ImageNet, replace the last layer and fine-tune it on the APPA-REAL database with averaged apparent age as labels. To estimate uncertainty, we append an additional linear layer to the second last layer of the network, which maps its outputs (with a dimension of 2048) to a logarithmic uncertainty measure. We then fine-tune only this linear layer. (Two hedged PyTorch sketches of this setup follow the table.) |
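The Experiment Setup row states that the prediction model is pre-trained with MSE loss before the variance training, with the network emitting a logarithmic uncertainty measure. For orientation, here is a minimal sketch of the standard variance-attenuation (VA) objective that the paper compares DVA against: the Gaussian negative log-likelihood over a predicted mean and log-variance. This is the well-known VA baseline, not the DVA algorithm itself, whose steps are deferred to the paper's Appendix B; the function name and tensor shapes are our own assumptions.

```python
import torch

def va_loss(mu: torch.Tensor, log_var: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Gaussian NLL used by standard variance attenuation (VA), up to a constant:
    0.5 * log(sigma^2) + (y - mu)^2 / (2 * sigma^2).
    Predicting log-variance keeps the variance positive and the loss stable."""
    return (0.5 * log_var + 0.5 * (y - mu) ** 2 * torch.exp(-log_var)).mean()
```

At its minimum, `exp(log_var)` matches the conditional variance of `y` given the input; the paper's point is that in practice this estimate also absorbs model (epistemic) error, which DVA corrects to get closer to the actual data uncertainty.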
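The age-estimation setup in the same row (an ImageNet-pretrained ResNeXt-50 backbone, a linear layer mapping the 2048-d penultimate features to a log-uncertainty output, with only that layer fine-tuned during variance training) could look roughly like the sketch below. Everything beyond what the row states, including variable names, the torchvision weight enum, and the optimizer and learning rate, is an assumption rather than the authors' code; their actual implementation is in the linked repository.

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNeXt-50 (32x4d) pretrained on ImageNet, as stated in the Experiment Setup row.
backbone = models.resnext50_32x4d(
    weights=models.ResNeXt50_32X4D_Weights.IMAGENET1K_V1
)
feat_dim = backbone.fc.in_features   # 2048-d penultimate features
backbone.fc = nn.Identity()          # drop the ImageNet classifier, expose features

mean_head = nn.Linear(feat_dim, 1)     # apparent-age regressor (fine-tuned with MSE first)
log_var_head = nn.Linear(feat_dim, 1)  # logarithmic uncertainty measure

# Variance training: only the appended linear layer is fine-tuned,
# so freeze the backbone and the mean head.
for p in backbone.parameters():
    p.requires_grad = False
for p in mean_head.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(log_var_head.parameters(), lr=1e-4)  # lr is assumed

def forward(x: torch.Tensor):
    feats = backbone(x)
    return mean_head(feats), log_var_head(feats)
```

Paired with `va_loss` above, a variance-training step would compute `mu, log_var = forward(images)` and backpropagate `va_loss(mu, log_var, ages)`, updating only the uncertainty head.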