Stochastic Backpropagation and Approximate Inference in Deep Generative Models

Authors: Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra

ICML 2014

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate on several real-world data sets that by using stochastic backpropagation and variational inference, we obtain models that are able to generate realistic samples of data, allow for accurate imputations of missing data, and provide a useful tool for high-dimensional data visualisation." |
| Researcher Affiliation | Industry | Danilo Jimenez Rezende (danilor@google.com), Shakir Mohamed (shakir@google.com), Daan Wierstra (daanw@google.com), Google DeepMind, London, United Kingdom |
| Pseudocode | Yes | Algorithm 1: Learning in DLGMs |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a direct link to a code repository. |
| Open Datasets | Yes | The binarised MNIST digits data set of Larochelle & Murray (2011); the NORB object recognition data set; the CIFAR10 natural images data set; the Frey faces data set; and the Street View House Numbers (SVHN) data set (Netzer et al., 2011). |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments. |
| Experiment Setup | Yes | For one experiment: "The model consists of two deterministic layers with 200 hidden units and a stochastic layer of 200 latent variables. We use minibatches of 200 observations and trained the model using stochastic backpropagation." For a second: "the generative model consists of 100 latent variables feeding into a deterministic layer of 300 nodes, which then feeds to the observation likelihood. We use the same structure for the recognition model." |
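Since no source code is released, the following is a minimal sketch of the stochastic-backpropagation step behind Algorithm 1, using the 200-unit minibatch and layer sizes quoted in the Experiment Setup row. It is not the authors' implementation: it collapses the two deterministic layers into one for brevity, assumes a 784-dimensional binarised-MNIST input, and all function and parameter names (init_params, neg_elbo, the choice of JAX and tanh activations) are illustrative assumptions.

```python
# Sketch of stochastic backpropagation (reparameterisation trick) for a
# DLGM-style model. Hypothetical names and shapes; not the paper's code.
import jax
import jax.numpy as jnp

LATENT, HIDDEN, BATCH = 200, 200, 200  # sizes quoted in the setup row
X_DIM = 784                            # assumed: binarised MNIST pixels

def init_params(key):
    k1, k2, k3, k4 = jax.random.split(key, 4)
    scale = 0.01
    return {
        "enc_w":      scale * jax.random.normal(k1, (X_DIM, HIDDEN)),
        "enc_mu":     scale * jax.random.normal(k2, (HIDDEN, LATENT)),
        "enc_logvar": scale * jax.random.normal(k3, (HIDDEN, LATENT)),
        "dec_w":      scale * jax.random.normal(k4, (LATENT, X_DIM)),
    }

def neg_elbo(params, x, eps):
    # Recognition model: q(z|x) = N(mu, diag(exp(logvar)))
    h = jnp.tanh(x @ params["enc_w"])
    mu = h @ params["enc_mu"]
    logvar = h @ params["enc_logvar"]
    # Reparameterisation: z = mu + sigma * eps makes the sample a
    # deterministic, differentiable function of the parameters.
    z = mu + jnp.exp(0.5 * logvar) * eps
    # Generative model: Bernoulli likelihood on binarised pixels.
    logits = z @ params["dec_w"]
    log_lik = jnp.sum(x * jax.nn.log_sigmoid(logits)
                      + (1.0 - x) * jax.nn.log_sigmoid(-logits))
    # Analytic KL(q(z|x) || N(0, I)) term of the free energy.
    kl = 0.5 * jnp.sum(jnp.exp(logvar) + mu**2 - 1.0 - logvar)
    return (kl - log_lik) / x.shape[0]

key = jax.random.PRNGKey(0)
k_init, k_data, k_eps = jax.random.split(key, 3)
params = init_params(k_init)
x = jax.random.bernoulli(k_data, 0.5, (BATCH, X_DIM)).astype(jnp.float32)
eps = jax.random.normal(k_eps, (BATCH, LATENT))
# Gradients of the Monte Carlo bound flow through the sampling step.
grads = jax.grad(neg_elbo)(params, x, eps)
```

The point the sketch illustrates is the one Algorithm 1 exploits: writing z = mu + sigma * eps moves the randomness into a fixed noise distribution, so the expectation over q(z|x) can be differentiated with ordinary backpropagation, giving unbiased gradient estimates of the variational bound.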