One Step Closer to Unbiased Aleatoric Uncertainty Estimation
Authors: Wang Zhang, Ziwen Martin Ma, Subhro Das, Tsui-Wei Lily Weng, Alexandre Megretski, Luca Daniel, Lam M. Nguyen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method. ... To showcase and validate the effectiveness of the DVA technique, we start with a simple regression example. |
| Researcher Affiliation | Collaboration | 1 Massachusetts Institute of Technology, Cambridge, MA, USA 2 Harvard University, Cambridge, MA, USA 3 MIT-IBM Watson AI Lab, IBM Research, Cambridge, MA, USA 4 University of California San Diego, San Diego, CA, USA 5 IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, USA |
| Pseudocode | Yes | The key steps of the algorithm are listed below, with the detailed algorithm being deferred to Appendix B. |
| Open Source Code | Yes | Source code available at https://github.com/wz16/DVA. Please refer to arXiv for full technical appendix. |
| Open Datasets | Yes | The NYU Depth v2 dataset (Silberman et al. 2012) contains 27k RGB-depth image pairs. ... utilizing the APPA-REAL database (Agustsson et al. 2017) (Figure 4 shows two samples) consisting of 7591 images. |
| Dataset Splits | No | The paper mentions training data for the toy example (“A total of 1000 samples are drawn from x ∈ [1, 9] for training”) but does not provide specific train/validation/test split percentages, sample counts, or detailed splitting methodology for all experiments, especially for the real-world datasets like NYU Depth v2 or APPA-REAL. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or library names with version numbers needed to replicate the experiment. |
| Experiment Setup | Yes | We employ two stochastic prediction models: an ensemble model with 5 base learners and a Bayesian neural network (BNN, Hernández-Lobato and Adams 2015) with a sampling size of 5. For a fair comparison between the VA and DVA, the prediction model is pre-trained with MSE loss ahead of the variance training. We employ a ResNeXt-50 (32×4d) pretrained on ImageNet, replace the last layer and fine-tune it on the APPA-REAL database with averaged apparent age as labels. To estimate uncertainty, we append an additional linear layer to the second last layer of the network, which maps its outputs (with a dimension of 2048) to a logarithmic uncertainty measure. We then fine-tune only this linear layer. (Two hedged PyTorch sketches of this setup follow the table.) |
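The Experiment Setup row states that the prediction model is pre-trained with MSE loss before the variance training, with the network emitting a logarithmic uncertainty measure. For orientation, here is a minimal sketch of the standard variance-attenuation (VA) objective that the paper compares DVA against: the Gaussian negative log-likelihood over a predicted mean and log-variance. This is the well-known VA baseline, not the DVA algorithm itself, whose steps are deferred to the paper's Appendix B; the function name and tensor shapes are our own assumptions.

```python
import torch

def va_loss(mu: torch.Tensor, log_var: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Gaussian NLL used by standard variance attenuation (VA), up to a constant:
    0.5 * log(sigma^2) + (y - mu)^2 / (2 * sigma^2).
    Predicting log-variance keeps the variance positive and the loss stable."""
    return (0.5 * log_var + 0.5 * (y - mu) ** 2 * torch.exp(-log_var)).mean()
```

At its minimum, `exp(log_var)` matches the conditional variance of `y` given the input; the paper's point is that in practice this estimate also absorbs model (epistemic) error, which DVA corrects to get closer to the actual data uncertainty.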
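The age-estimation setup in the same row (an ImageNet-pretrained ResNeXt-50 backbone, a linear layer mapping the 2048-d penultimate features to a log-uncertainty output, with only that layer fine-tuned during variance training) could look roughly like the sketch below. Everything beyond what the row states, including variable names, the torchvision weight enum, and the optimizer and learning rate, is an assumption rather than the authors' code; their actual implementation is in the linked repository.

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNeXt-50 (32x4d) pretrained on ImageNet, as stated in the Experiment Setup row.
backbone = models.resnext50_32x4d(
    weights=models.ResNeXt50_32X4D_Weights.IMAGENET1K_V1
)
feat_dim = backbone.fc.in_features   # 2048-d penultimate features
backbone.fc = nn.Identity()          # drop the ImageNet classifier, expose features

mean_head = nn.Linear(feat_dim, 1)     # apparent-age regressor (fine-tuned with MSE first)
log_var_head = nn.Linear(feat_dim, 1)  # logarithmic uncertainty measure

# Variance training: only the appended linear layer is fine-tuned,
# so freeze the backbone and the mean head.
for p in backbone.parameters():
    p.requires_grad = False
for p in mean_head.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(log_var_head.parameters(), lr=1e-4)  # lr is assumed

def forward(x: torch.Tensor):
    feats = backbone(x)
    return mean_head(feats), log_var_head(feats)
```

Paired with `va_loss` above, a variance-training step would compute `mu, log_var = forward(images)` and backpropagate `va_loss(mu, log_var, ages)`, updating only the uncertainty head.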