Weight Uncertainty in Neural Network
Authors: Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that this principled kind of regularisation yields comparable performance to dropout on MNIST classification. We then demonstrate how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems, and how this weight uncertainty can be used to drive the exploration-exploitation trade-off in reinforcement learning. |
| Researcher Affiliation | Industry | Charles Blundell CBLUNDELL@GOOGLE.COM Julien Cornebise JUCOR@GOOGLE.COM Koray Kavukcuoglu KORAYK@GOOGLE.COM Daan Wierstra WIERSTRA@GOOGLE.COM Google DeepMind |
| Pseudocode | Yes | Each step of optimisation proceeds as follows: 1. Sample ϵ ∼ N(0, I). 2. Let w = µ + log(1 + exp(ρ)) ∘ ϵ. 3. Let θ = (µ, ρ). 4. Let f(w, θ) = log q(w|θ) − log P(w)P(D|w). 5. Calculate the gradient with respect to the mean: ∆µ = ∂f(w, θ)/∂w + ∂f(w, θ)/∂µ. 6. Calculate the gradient with respect to the standard deviation parameter ρ: ∆ρ = ∂f(w, θ)/∂w · ϵ/(1 + exp(−ρ)) + ∂f(w, θ)/∂ρ. 7. Update the variational parameters: µ ← µ − α∆µ and ρ ← ρ − α∆ρ. (A runnable sketch of this update step follows the table.) |
| Open Source Code | No | The paper does not contain any explicit statements about making source code available, nor does it provide links to a code repository. |
| Open Datasets | Yes | We trained networks of various sizes on the MNIST digits dataset (Le Cun and Cortes, 1998), consisting of 60,000 training and 10,000 testing pixel images of size 28 by 28. |
| Dataset Splits | Yes | We trained on 50,000 digits and used 10,000 digits as a validation set, whilst Hinton et al. (2012) trained on 60,000 digits and did not use a validation set. |
| Hardware Specification | No | The paper mentions that 'all of the operations used are readily implemented on a GPU' but does not specify any particular GPU model (e.g., NVIDIA A100, Tesla V100), CPU model, memory, or any other specific hardware configuration details. |
| Software Dependencies | No | The paper does not explicitly mention any specific software dependencies or their version numbers (e.g., Python, TensorFlow, PyTorch, scikit-learn, etc.) that would be necessary for replication. |
| Experiment Setup | Yes | We considered learning rates of 10^-3, 10^-4 and 10^-5 with minibatches of size 128. For Bayes by Backprop, we averaged over either 1, 2, 5, or 10 samples and considered π ∈ {1/4, 1/2, 3/4}, −log σ1 ∈ {0, 1, 2} and −log σ2 ∈ {6, 7, 8}. (A sketch of this scale mixture prior at one grid point follows the table.) |
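
The Pseudocode row above quotes the paper's per-step optimisation procedure. The following is a minimal sketch of that step, assuming PyTorch autograd, a toy one-weight Gaussian regression likelihood, and a single Gaussian prior in place of the paper's scale mixture; the data, hyperparameters, and variable names are illustrative, not the paper's MNIST setup.

```python
# Minimal Bayes-by-Backprop step (sketch): sample w via the reparameterisation
# w = mu + log(1 + exp(rho)) * eps, form f(w, theta) = log q(w|theta) - log P(w) - log P(D|w),
# and take a gradient step on the variational parameters (mu, rho).
import torch

torch.manual_seed(0)

# Toy regression data standing in for D (assumption, not the paper's dataset).
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)
y = 2.0 * x + 0.1 * torch.randn_like(x)

# Variational parameters theta = (mu, rho) for a single weight.
mu = torch.zeros(1, requires_grad=True)
rho = torch.full((1,), -3.0, requires_grad=True)  # sigma = log(1 + exp(rho)) > 0

prior_sigma = 1.0   # simple Gaussian prior N(0, prior_sigma^2) (assumption)
noise_sigma = 0.1   # observation noise of the Gaussian likelihood (assumption)
alpha = 1e-2        # learning rate

for step in range(1000):
    # Steps 1-3: sample eps ~ N(0, I) and build w and theta.
    eps = torch.randn_like(mu)
    sigma = torch.log1p(torch.exp(rho))
    w = mu + sigma * eps

    # Step 4: f(w, theta) = log q(w|theta) - log P(w) - log P(D|w).
    log_q = torch.distributions.Normal(mu, sigma).log_prob(w).sum()
    log_prior = torch.distributions.Normal(0.0, prior_sigma).log_prob(w).sum()
    log_lik = torch.distributions.Normal(x * w, noise_sigma).log_prob(y).sum()
    f = log_q - log_prior - log_lik

    # Steps 5-6: autograd differentiates through the reparameterisation, giving
    # the same sum of partial derivatives as the quoted pseudocode.
    f.backward()

    # Step 7: mu <- mu - alpha * grad_mu, rho <- rho - alpha * grad_rho.
    with torch.no_grad():
        mu -= alpha * mu.grad
        rho -= alpha * rho.grad
        mu.grad.zero_()
        rho.grad.zero_()
```

After training on this toy data, `mu` drifts toward the true slope (about 2) while `log1p(exp(rho))` reports the remaining weight uncertainty.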
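The Open Datasets and Dataset Splits rows describe MNIST with a 50,000/10,000 train/validation split and minibatches of 128. A minimal loading sketch is below, assuming torchvision for the download and split; the paper does not state which tooling it used.

```python
# Sketch of the 50k/10k MNIST train/validation split (torchvision is an assumption).
import torch
from torchvision import datasets, transforms

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())  # 60,000 training digits
train_set, val_set = torch.utils.data.random_split(
    mnist, [50_000, 10_000], generator=torch.Generator().manual_seed(0))

train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=128)
```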
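The Experiment Setup row lists the grid over the scale mixture prior hyperparameters π, σ1, σ2. The sketch below evaluates the log prior log P(w) = Σ_j log(π N(w_j; 0, σ1²) + (1 − π) N(w_j; 0, σ2²)) at one grid point, assuming σ is parameterised as e^(−k) for the quoted values of −log σ; names and the choice of grid point are illustrative.

```python
# Scale mixture Gaussian prior (sketch) at the grid point pi = 1/4, -log sigma1 = 1, -log sigma2 = 7.
import math
import torch

pi = 0.25
sigma1 = math.exp(-1.0)  # broad component
sigma2 = math.exp(-7.0)  # narrow component concentrated near zero

def log_scale_mixture_prior(w: torch.Tensor) -> torch.Tensor:
    """log P(w) = sum_j log( pi * N(w_j; 0, sigma1^2) + (1 - pi) * N(w_j; 0, sigma2^2) )."""
    p1 = torch.distributions.Normal(0.0, sigma1).log_prob(w).exp()
    p2 = torch.distributions.Normal(0.0, sigma2).log_prob(w).exp()
    # A logsumexp formulation would be more numerically stable for large |w|.
    return torch.log(pi * p1 + (1.0 - pi) * p2).sum()

print(log_scale_mixture_prior(torch.zeros(10)))  # all-zero weights: both components contribute
```

This term would replace the single Gaussian `log_prior` in the optimisation sketch above when reproducing the paper's prior.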