Flexible and accurate inference and learning for deep generative models

Authors: Eszter Vértes, Maneesh Sahani

Venue: NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that the new algorithm outperforms current state-of-the-art methods on synthetic, natural image patch and the MNIST data sets."
Researcher Affiliation | Academia | Eszter Vértes, Maneesh Sahani; Gatsby Computational Neuroscience Unit, University College London, London, W1T 4JG; {eszter, maneesh}@gatsby.ucl.ac.uk
Pseudocode | Yes | Algorithm 1, DDC Helmholtz Machine training (a runnable sketch follows this table):
    Initialise θ
    repeat
        Sleep phase:
            for s = 1 ... S, sample: z_L^(s), ..., z_1^(s), x^(s) ~ p_θ(x, z_1, ..., z_L)
            update recognition parameters {φ_l} [eq. 7]
            update function approximators {α_l, β_l} [appendix]
        Wake phase:
            x ← {minibatch}
            evaluate r_l(x, φ) [eq. 8]
            update θ: θ ← θ + ∇_θ F(x, r(x, φ), θ) [appendix]
    until |∇_θ F| < threshold
Open Source Code | No | The paper does not provide any links to open-source code or an explicit statement about its public availability.
Open Datasets | Yes | "We tested the scalability of the DDC-HM by applying it to a natural image data set [22]. We used the binarised MNIST dataset of 28x28 images of handwritten digits [23]."
Dataset Splits | No | The paper mentions training on synthetic data and a test set for MNIST (N=10000), but it does not specify explicit training/validation/test splits (percentages, counts, or partitioning methodology) in enough detail to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide any specific hardware details, such as exact GPU/CPU models, processor types, or memory amounts used for its experiments.
Software Dependencies | No | The paper mentions the "Adam optimiser" [19] but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their specific versions).
Experiment Setup | Yes | "We used a recognition model with a hidden layer of size 100, and K1 = K2 = 100 encoding functions for each latent layer, with 200 sleep samples, and learned the parameters of the conditional distributions p(x|z1) and p(z1|z2) while keeping the prior on z2 fixed (m=3, σ=0.1). We initialised each model to the true generative parameters and ran the algorithms until convergence (1000 epochs, learning rate: 10^−4, using the Adam optimiser; [19])." (The quoted settings are collected into a configuration sketch below.)
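
For concreteness, the following is a minimal, runnable sketch of the wake-sleep loop structure in Algorithm 1, reduced to a toy single-latent-layer linear-Gaussian model. The generative model, Gaussian-bump encoding functions, polynomial recognition features, and every name below are illustrative assumptions rather than the paper's architecture; the two sleep-phase regressions stand in for the recognition update (eq. 7) and the function-approximator update (appendix), and the wake-phase step uses the DDC representation to approximate the posterior expectations entering the gradient.

import numpy as np

rng = np.random.default_rng(0)

# True generative parameter used to simulate "data", and the model's initial guess.
theta_true, sigma_x = 2.0, 0.5
theta = 0.1

# DDC encoding functions psi_k(z): Gaussian bumps on a grid (illustrative choice).
centres = np.linspace(-3, 3, 20)

def psi(z):                          # (N,) -> (N, K)
    return np.exp(-0.5 * ((z[:, None] - centres[None, :]) / 0.5) ** 2)

def features(x):                     # recognition features of x (illustrative choice)
    return np.column_stack([np.ones_like(x), x, x ** 2])

lr, S, n_epochs = 2e-2, 2000, 500
# Simulated observations from z ~ N(0, 1), x | z ~ N(theta_true * z, sigma_x^2).
X_data = theta_true * rng.standard_normal(500) + sigma_x * rng.standard_normal(500)

for epoch in range(n_epochs):
    # ---- Sleep phase: sample latents and observations from the current model ----
    z_s = rng.standard_normal(S)
    x_s = theta * z_s + sigma_x * rng.standard_normal(S)

    # Recognition weights: regress psi(z) onto features(x), so that
    # features(x) @ Phi approximates the DDC posterior representation E[psi(z) | x].
    Phi, *_ = np.linalg.lstsq(features(x_s), psi(z_s), rcond=None)

    # Function approximators: express z and z^2 in the psi basis, so the needed
    # conditional expectations can be read out of the DDC representation.
    A, *_ = np.linalg.lstsq(psi(z_s), np.column_stack([z_s, z_s ** 2]), rcond=None)

    # ---- Wake phase: update theta on a minibatch of (here, simulated) data ----
    x_b = rng.choice(X_data, size=64, replace=False)
    r = features(x_b) @ Phi                          # DDC representation r(x)
    Ez, Ez2 = (r @ A).T                              # approx. E[z|x], E[z^2|x]
    grad = np.mean(x_b * Ez - theta * Ez2) / sigma_x ** 2
    theta += lr * grad                               # gradient ascent step

print(f"learned theta = {theta:.2f} (true value {theta_true:.2f})")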
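
The hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration record. The field names below are our own shorthand (the paper defines no such structure); the values are taken from the quoted text.

# Hypothetical configuration summary; keys are illustrative, values quoted above.
ddc_hm_config = {
    "recognition_hidden_units": 100,            # hidden layer of size 100
    "encoding_functions": {"K1": 100, "K2": 100},
    "sleep_samples": 200,
    "z2_prior": {"m": 3, "sigma": 0.1},         # prior on z2 kept fixed
    "learned_conditionals": ["p(x|z1)", "p(z1|z2)"],
    "optimizer": "Adam",                        # [19]
    "learning_rate": 1e-4,
    "epochs": 1000,
}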