Neural Variational Inference and Learning in Belief Networks

Authors: Andriy Mnih, Karol Gregor

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We performed two sets of experiments, with the first set intended to evaluate the effectiveness of our variance reduction techniques and to compare NVIL's performance to that of the wake-sleep algorithm. In the second set of experiments, we demonstrate NVIL's ability to handle larger real-world datasets by using it to train generative models of documents.
Researcher Affiliation | Industry | Andriy Mnih (amnih@google.com) and Karol Gregor (karolg@google.com), Google DeepMind
Pseudocode | No | The paper states, 'The algorithm for computing NVIL parameter updates using the variance reduction techniques described so far is provided in the supplementary material,' but does not include pseudocode or an algorithm block within the main document.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor does it include a link to a code repository for the described methodology.
Open Datasets | Yes | Our first set of experiments was performed on the binarized version of the MNIST dataset, which has become the standard benchmark for evaluating generative models of binary data. ... We used the binarization of Salakhutdinov & Murray (2008)... We trained two simple models on the 20 Newsgroups and Reuters Corpus Volume I (RCV1-v2) datasets, which have been used to evaluate similar models in (Salakhutdinov & Hinton, 2009b; Larochelle & Lauly, 2012).
Dataset Splits | Yes | The dataset consists of 70,000 28×28 binary images of handwritten digits, partitioned into a 60,000-image training set and 10,000-image test set. ... For each dataset, we created a validation set by removing a random subset of 100 observations from the training set.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific libraries or frameworks like TensorFlow/PyTorch with their versions).
Experiment Setup | Yes | We trained all models using stochastic gradient ascent with minibatches of 20 observations sampled randomly from the training data. The gradient estimates were computed using a single sample from the inference network. ... We used 3 × 10^-4 as the learning rate for training models with NVIL on this dataset. ... Wake-sleep training used a learning rate of 1 × 10^-4... We implemented each input-dependent baseline using a neural network with a single hidden layer of 100 tanh units. ... The learning rates were 3 × 10^-5 on 20 Newsgroups and 10^-3 on RCV1. (A sketch of this configuration appears below the table.)
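
To make the setup in the last row concrete, below is a minimal NumPy sketch of a single NVIL-style update for a one-layer sigmoid belief network, using the quoted minibatch size of 20, a single sample from the inference network per observation, plain stochastic gradient ascent with the 3 × 10^-4 learning rate, and an input-dependent baseline with one hidden layer of 100 tanh units. The layer sizes, initialisation, and the omission of the paper's other variance reduction techniques (constant baseline, variance normalisation, local learning signals) are assumptions made for illustration only; the authoritative algorithm is in the paper's supplementary material.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis, n_hid = 784, 200      # assumed sizes: binarized MNIST inputs, 200 latent units
lr, batch = 3e-4, 20         # learning rate and minibatch size quoted in the paper

# Parameters of the generative model p(h)p(x|h), the inference network q(h|x),
# and the input-dependent baseline (one hidden layer of 100 tanh units).
W_gen = rng.normal(0.0, 0.01, (n_hid, n_vis)); b_gen = np.zeros(n_vis)
b_prior = np.zeros(n_hid)
W_inf = rng.normal(0.0, 0.01, (n_vis, n_hid)); b_inf = np.zeros(n_hid)
W_b1 = rng.normal(0.0, 0.01, (n_vis, 100));    b_b1 = np.zeros(100)
W_b2 = rng.normal(0.0, 0.01, (100, 1));        b_b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bernoulli_logprob(bits, probs):
    # Sum of Bernoulli log-probabilities over units, one value per example.
    return np.sum(bits * np.log(probs + 1e-8)
                  + (1.0 - bits) * np.log(1.0 - probs + 1e-8), axis=1)

def nvil_gradients(x):
    # A single sample from the inference network per observation.
    q = sigmoid(x @ W_inf + b_inf)
    h = (rng.random(q.shape) < q).astype(float)

    # Learning signal: log p(x, h) - log q(h | x).
    p_x = sigmoid(h @ W_gen + b_gen)
    log_p = bernoulli_logprob(x, p_x) + bernoulli_logprob(h, sigmoid(b_prior)[None, :])
    signal = log_p - bernoulli_logprob(h, q)

    # Input-dependent baseline subtracted from the learning signal.
    hid_b = np.tanh(x @ W_b1 + b_b1)
    baseline = (hid_b @ W_b2 + b_b2).ravel()
    centred = (signal - baseline)[:, None]

    grads = {
        # Generative model: ordinary log-likelihood gradients given the sampled h.
        "W_gen": h.T @ (x - p_x) / batch,
        "b_gen": np.mean(x - p_x, axis=0),
        "b_prior": np.mean(h - sigmoid(b_prior)[None, :], axis=0),
        # Inference network: REINFORCE-style estimator with the centred signal.
        "W_inf": x.T @ (centred * (h - q)) / batch,
        "b_inf": np.mean(centred * (h - q), axis=0),
    }
    # Baseline network is regressed onto the learning signal (squared-error loss).
    d_out = centred / batch
    grads["W_b2"] = hid_b.T @ d_out
    grads["b_b2"] = d_out.sum(axis=0)
    d_hid = (d_out @ W_b2.T) * (1.0 - hid_b ** 2)
    grads["W_b1"] = x.T @ d_hid
    grads["b_b1"] = d_hid.sum(axis=0)
    return grads

# One ascent step on a random binary minibatch standing in for binarized MNIST images.
params = {"W_gen": W_gen, "b_gen": b_gen, "b_prior": b_prior,
          "W_inf": W_inf, "b_inf": b_inf,
          "W_b1": W_b1, "b_b1": b_b1, "W_b2": W_b2, "b_b2": b_b2}
x_batch = (rng.random((batch, n_vis)) < 0.5).astype(float)
for name, g in nvil_gradients(x_batch).items():
    params[name] += lr * g   # gradient ascent on the variational lower bound
```

In a full reproduction, this update would be repeated over minibatches drawn from the 60,000-image binarized MNIST training set described above, with the 100-observation validation split used for model selection.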