Bayesian Uncertainty Estimation for Batch Normalized Deep Networks
Authors: Mattias Teye, Hossein Azizpour, Kevin Smith
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach is thoroughly validated by measuring the quality of uncertainty in a series of empirical experiments on different tasks. It outperforms baselines with strong statistical significance, and displays competitive performance with recent Bayesian approaches. |
| Researcher Affiliation | Collaboration | (1) School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden; (2) current address: Electronic Arts, SEED, Stockholm, Sweden (this work was carried out at Budbee AB); (3) Science for Life Laboratory. |
| Pseudocode | Yes | Algorithm 1: MCBN Algorithm (a hedged code sketch of this procedure appears below the table). |
| Open Source Code | Yes | Code for reproducing our experiments is available at https://github.com/icml-mcbn/mcbn. |
| Open Datasets | Yes | Our quantitative analysis relies on CIFAR10 for image classification and eight standard regression datasets, listed in Appendix Table 1, which are publicly available from the UCI Machine Learning Repository (University of California, 2017) and Delve (Ghahramani, 1996). |
| Dataset Splits | Yes | Results were averaged over five random splits of 20% test and 80% training and cross-validation (CV) data. For each split, 5-fold CV by grid search with an RMSE minimization objective was used to find training hyperparameters and the optimal number of epochs, out of a maximum of 2000 (see the cross-validation sketch below the table). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. It states that 'Implementation was done in TensorFlow', but gives no hardware specifics. |
| Software Dependencies | No | The paper mentions 'TensorFlow' and the 'Adam optimizer' but does not specify version numbers. |
| Experiment Setup | Yes | For the regression task, all models share a similar architecture: two hidden layers with 50 units each, and ReLU activations... For BN-based models, the hyperparameter grid consisted of a weight decay factor ranging from 0.1 to 1e-15 on a log10 scale, and a batch size range from 32 to 1024 on a log2 scale. For DO-based models, the hyperparameter grid consisted of the same weight decay range, and dropout probabilities in {0.2, 0.1, 0.05, 0.01, 0.005, 0.001}. DO-based models used a batch size of 32 in all evaluations. ... Estimates for the predictive distribution were obtained by taking T = 500 stochastic forward passes through the network. ... We trained a ResNet32 architecture with a batch size of 32, learning rate of 0.1, weight decay of 0.0002, leaky ReLU slope of 0.1, and 5 residual units. SGD with momentum was used as the optimizer. (The regression architecture and the MCBN prediction loop are sketched in code below the table.) |
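
The sketches below expand on the rows above. First, the regression architecture quoted in the Experiment Setup row (two hidden layers of 50 units with ReLU activations) can be written down directly. This is a minimal sketch assuming a TensorFlow 2 / Keras implementation with a BatchNormalization layer in each hidden block (the exact placement is an assumption); the hyperparameter values are placeholders, the function name `build_regression_net` is illustrative, and the paper's released code at https://github.com/icml-mcbn/mcbn remains the authoritative reference.

```python
import tensorflow as tf

def build_regression_net(input_dim, weight_decay=1e-2):
    """Two hidden layers of 50 units with batch normalization and ReLU,
    matching the regression setup described in the paper. The weight decay
    value here is a placeholder, not one of the paper's tuned settings."""
    reg = tf.keras.regularizers.l2(weight_decay)
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(50, kernel_regularizer=reg),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Dense(50, kernel_regularizer=reg),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Dense(1),
    ])

# Adam is the optimizer the paper mentions for regression; the loss and
# input dimension below are placeholders for this sketch.
model = build_regression_net(input_dim=8)
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
```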
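Second, the Pseudocode row points to Algorithm 1 (MCBN): at test time, run T stochastic forward passes, each normalized with the batch statistics of a freshly sampled training mini-batch, and summarize the passes into a predictive mean and variance. The sketch below is a simplified rendering of that loop for a Keras model such as the one above, with `x_train` and `x_test` assumed to be NumPy arrays. For brevity it concatenates the test inputs with the sampled mini-batch and relies on training-mode batch normalization, so the test points contribute slightly to the batch statistics, whereas Algorithm 1 computes them from the training mini-batch alone; it also omits the inverse model precision term that Algorithm 1 adds to the predictive variance.

```python
import numpy as np

def mcbn_predict(model, x_test, x_train, batch_size=32, T=500, rng=None):
    """Approximate MCBN predictive mean and variance by averaging T forward
    passes, each normalized with statistics from a random training mini-batch."""
    rng = rng if rng is not None else np.random.default_rng()
    preds = []
    for _ in range(T):
        idx = rng.choice(len(x_train), size=batch_size, replace=False)
        joint = np.concatenate([x_train[idx], x_test]).astype("float32")
        # training=True makes BatchNormalization use the current batch's
        # statistics rather than the stored moving averages (it also updates
        # those moving averages as a side effect, which is harmless here).
        out = model(joint, training=True).numpy()
        preds.append(out[batch_size:])           # keep only the test outputs
    preds = np.stack(preds)                      # shape (T, n_test, out_dim)
    # Algorithm 1 additionally adds the inverse model precision tau^-1 to the
    # predictive variance; that constant term is omitted in this sketch.
    return preds.mean(axis=0), preds.var(axis=0)

mean, var = mcbn_predict(model, x_test, x_train, batch_size=32, T=500)
```

With T = 500 passes, as quoted in the Experiment Setup row, this yields one predictive mean and one predictive variance per test point.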
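Finally, the Dataset Splits row describes the evaluation protocol: five random 80/20 splits, each with a 5-fold cross-validation grid search that minimizes RMSE. A minimal sketch of that outer loop follows, assuming scikit-learn utilities; `train_and_eval_rmse` is a hypothetical helper standing in for model training and validation scoring.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

def evaluate_protocol(X, y, param_grid, n_repeats=5, seed=0):
    """Five random 80/20 splits; on each, a 5-fold CV grid search picks the
    configuration with the lowest mean validation RMSE. `train_and_eval_rmse`
    is a hypothetical helper: it would train a model with `params` (including
    the number of epochs, capped at 2000 in the paper) and return the RMSE
    on the validation fold."""
    selected = []
    for r in range(n_repeats):
        X_cv, X_test, y_cv, y_test = train_test_split(
            X, y, test_size=0.2, random_state=seed + r)
        best_params, best_rmse = None, np.inf
        for params in param_grid:
            folds = KFold(n_splits=5, shuffle=True, random_state=seed).split(X_cv)
            rmse = np.mean([
                train_and_eval_rmse(X_cv[tr], y_cv[tr], X_cv[va], y_cv[va], params)
                for tr, va in folds
            ])
            if rmse < best_rmse:
                best_params, best_rmse = params, rmse
        selected.append((best_params, (X_test, y_test)))
    return selected
```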