Dropout Inference in Bayesian Neural Networks with Alpha-divergences

Authors: Yingzhen Li, Yarin Gal

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We test the reparameterised BB-α on Bayesian NNs with the dropout approximation. We assess the proposed inference in regression and classification tasks on standard benchmarking datasets, comparing different values of α."
Researcher Affiliation | Academia | Yingzhen Li¹, Yarin Gal¹ ²; ¹University of Cambridge, UK; ²The Alan Turing Institute, UK.
Pseudocode | Yes | Figure 1 gives the induced classification loss:

def softmax_cross_ent_with_mc_logits(alpha):
    def loss(y_true, mc_logits):
        # mc_logits: MC samples of shape M x K x D
        mc_log_softmax = mc_logits \
            - K.max(mc_logits, axis=2, keepdims=True)
        mc_log_softmax = mc_log_softmax \
            - logsumexp(mc_log_softmax, 2)
        mc_ll = K.sum(y_true * mc_log_softmax, -1)
        return -1. / alpha * (logsumexp(alpha * mc_ll, 1)
                              + K.log(1.0 / K_mc))
    return loss

Figure 1. Code snippet for our induced classification loss.
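For intuition, here is a minimal self-contained sketch (our own illustration, not code from the paper) of the same α-reweighted MC log-likelihood loss, written with NumPy and SciPy's logsumexp; the function name, array shapes, and batch averaging are assumptions made for the example:

import numpy as np
from scipy.special import logsumexp

def mc_alpha_loss(y_true, mc_logits, alpha=0.5):
    # y_true:    one-hot labels, shape (M, D)
    # mc_logits: MC-sampled logits, shape (M, K, D)
    M, K, D = mc_logits.shape
    # log-softmax over the class dimension for every MC sample
    mc_log_softmax = mc_logits - logsumexp(mc_logits, axis=2, keepdims=True)
    # per-MC-sample log-likelihood of the true class, shape (M, K)
    mc_ll = np.sum(y_true[:, None, :] * mc_log_softmax, axis=-1)
    # -1/alpha * log( (1/K) * sum_k exp(alpha * ll_k) ), averaged over the batch
    return np.mean(-1.0 / alpha * (logsumexp(alpha * mc_ll, axis=1) + np.log(1.0 / K)))

# toy check: 2 data points, 10 MC samples, 3 classes
rng = np.random.default_rng(0)
y = np.eye(3)[[0, 2]]
print(mc_alpha_loss(y, rng.normal(size=(2, 10, 3))))

As α → 0 this objective recovers the standard MC-dropout (variational) training loss, while α = 1 corresponds to EP-like behaviour, which is why the paper compares several α values.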
Open Source Code | No | "A code snippet for our induced loss is given in Figure 1, with more details in the appendix." (Only a small snippet is shown, not the full codebase, and no explicit release statement or link is present.)
Open Datasets | Yes | "We use benchmark UCI datasets that have been tested in related literature" (http://archive.ics.uci.edu/ml/datasets.html), and "We further experiment with a classification task, comparing the accuracy of the various α values on the MNIST benchmark (LeCun & Cortes, 1998)."
Dataset Splits | No | "We summarise the test negative log-likelihood (LL) and RMSE with standard error (across different random splits, the lower the better) for selected datasets in Figure 2 and 3, respectively." and "The adversarial examples are generated on MNIST test data that is normalised to be in the range [0, 1]." (No explicit percentages or counts for train/val/test splits are given.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | "Implementing this induced loss with Keras (Chollet, 2015) is as simple as a few lines of Python." (No specific version numbers for Keras or Python are provided.)
Experiment Setup | Yes | "The model is a single-layer neural network with 50 ReLU units for all datasets except for Protein and Year, which use 100 units. We consider α ∈ {0.0, 0.5, 1.0}... MC approximation with K = 10 samples is also deployed... We used dropout probability 0.5 and α ∈ {0, 0.5, 1}. Again, we use K = 10 samples at training time for all α values, and K_test = 100 samples at test time. We use weight decay 10⁻⁶..."
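For context, the following is a minimal sketch of the kind of network the quoted setup describes, assuming tf.keras; the hidden size, dropout rate, and weight decay come from the quote above, while the function name, output head, and use of an L2 kernel regularizer as "weight decay" are illustrative assumptions:

import tensorflow as tf

def make_mc_dropout_net(input_dim, n_out=1, hidden_units=50,
                        dropout_rate=0.5, weight_decay=1e-6):
    reg = tf.keras.regularizers.l2(weight_decay)
    inputs = tf.keras.Input(shape=(input_dim,))
    h = tf.keras.layers.Dense(hidden_units, activation="relu",
                              kernel_regularizer=reg)(inputs)
    # training=True keeps dropout stochastic at prediction time, so
    # repeated forward passes yield MC samples from the approximate posterior
    h = tf.keras.layers.Dropout(dropout_rate)(h, training=True)
    outputs = tf.keras.layers.Dense(n_out, kernel_regularizer=reg)(h)
    return tf.keras.Model(inputs, outputs)

# e.g. K_test = 100 stochastic forward passes at test time:
# preds = [model(x_test) for _ in range(100)]

For the classification experiments the output layer would produce logits, with K such stochastic passes stacked (shape M x K x D) and fed to the Figure 1 loss during training.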