Flexible and accurate inference and learning for deep generative models
Authors: Eszter Vértes, Maneesh Sahani
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that the new algorithm outperforms current state-of-the-art methods on synthetic, natural image patch and the MNIST data sets. |
| Researcher Affiliation | Academia | Eszter Vértes, Maneesh Sahani; Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG; {eszter, maneesh}@gatsby.ucl.ac.uk |
| Pseudocode | Yes | Algorithm 1 (DDC Helmholtz Machine training): Initialise θ. Repeat: Sleep phase: for s = 1...S, sample z_L^(s), ..., z_1^(s), x^(s) ∼ p_θ(x, z_1, ..., z_L); update recognition parameters {φ_l} [eq. 7]; update function approximators {α_l, β_l} [appendix]. Wake phase: x ← minibatch; evaluate r_l(x, φ) [eq. 8]; update θ using ∇_θ F(x, r(x, φ), θ) [appendix]. Until ‖∇_θ F‖ < threshold. (A structural sketch of this loop is given after the table.) |
| Open Source Code | No | The paper does not provide any specific links to open-source code or an explicit statement about its public availability. |
| Open Datasets | Yes | We tested the scalability of the DDC-HM by applying it to a natural image data set [22]. We used the binarised MNIST dataset of 28x28 images of handwritten digits [23]. |
| Dataset Splits | No | The paper mentions training on synthetic data and a test set for MNIST (N=10000), but it does not specify explicit training/validation/test splits (percentages, counts, or partitioning methodology) that would allow the data partitioning to be reproduced. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions the "Adam optimiser" [19] but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their specific versions). |
| Experiment Setup | Yes | We used a recognition model with a hidden layer of size 100, and K1 = K2 = 100 encoding functions for each latent layer, with 200 sleep samples, and learned the parameters of the conditional distributions p(x\|z1) and p(z1\|z2) while keeping the prior on z2 fixed (m=3, σ=0.1). We initialised each model to the true generative parameters and ran the algorithms until convergence (1000 epochs, learning rate: 10^-4, using the Adam optimiser; [19]). (These hyperparameters are collected into a configuration sketch after the table.) |
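
The Pseudocode row quotes Algorithm 1 as extracted from the PDF, which is easiest to read as a wake-sleep loop. Below is a minimal structural sketch of that loop, not the authors' implementation: every helper (`sample_generative`, `fit_recognition`, `fit_approximators`, `ddc_means`, `free_energy_grad`) is a hypothetical placeholder standing in for eq. 7, eq. 8 and the appendix updates, and the toy linear-Gaussian sampler is only there to make the sketch runnable.

```python
"""Structural sketch of Algorithm 1 (DDC Helmholtz Machine training).
All helpers below are placeholders, not the paper's actual updates."""
import numpy as np

rng = np.random.default_rng(0)

def sample_generative(theta, n_sleep):
    """Placeholder ancestral sampler for p_theta(x, z1, z2); toy linear-Gaussian layers."""
    z2 = rng.normal(size=(n_sleep, theta["W2"].shape[0]))
    z1 = z2 @ theta["W2"] + 0.1 * rng.normal(size=(n_sleep, theta["W2"].shape[1]))
    x = z1 @ theta["W1"] + 0.1 * rng.normal(size=(n_sleep, theta["W1"].shape[1]))
    return x, z1, z2

def fit_recognition(phi, x, z1, z2):
    """Stand-in for eq. 7: fit the recognition parameters on sleep samples (no-op here)."""
    return phi

def fit_approximators(alpha_beta, x, z1, z2):
    """Stand-in for the appendix update of the function approximators {alpha_l, beta_l} (no-op here)."""
    return alpha_beta

def ddc_means(x, phi):
    """Stand-in for eq. 8: the DDC mean representations r_l(x, phi) (identity here)."""
    return x

def free_energy_grad(x, r, theta):
    """Stand-in for the appendix expression for grad_theta F(x, r(x, phi), theta) (zeros here)."""
    return {k: np.zeros_like(v) for k, v in theta.items()}

# Initialise theta; repeat sleep and wake phases until the gradient norm is small.
theta = {"W2": rng.normal(size=(5, 20)), "W1": rng.normal(size=(20, 40))}
phi, alpha_beta = None, None
lr, n_sleep, threshold = 1e-4, 200, 1e-6

for epoch in range(1000):
    # Sleep phase: dream samples from the current generative model,
    # then update the recognition parameters and function approximators.
    x_s, z1_s, z2_s = sample_generative(theta, n_sleep)
    phi = fit_recognition(phi, x_s, z1_s, z2_s)
    alpha_beta = fit_approximators(alpha_beta, x_s, z1_s, z2_s)

    # Wake phase: evaluate the DDC means on a minibatch and take a
    # gradient step on the generative parameters theta.
    x_batch = x_s  # placeholder minibatch; real data would be used here
    r = ddc_means(x_batch, phi)
    grads = free_energy_grad(x_batch, r, theta)
    theta = {k: theta[k] + lr * grads[k] for k in theta}

    if max(np.abs(g).max() for g in grads.values()) < threshold:
        break
```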
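
Similarly, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration for quick reference. The key names below are labels chosen for this sketch, not identifiers from the paper; only the values come from the quoted text.

```python
# Hyperparameters quoted in the Experiment Setup row, gathered in one place.
# Key names are arbitrary labels for this sketch; values are from the quote.
ddc_hm_config = {
    "recognition_hidden_units": 100,           # hidden layer size of the recognition model
    "encoding_functions_z1": 100,              # K1
    "encoding_functions_z2": 100,              # K2
    "sleep_samples": 200,                      # samples drawn per sleep phase
    "epochs": 1000,
    "learning_rate": 1e-4,
    "optimiser": "Adam",                       # [19]
    "fixed_z2_prior": {"m": 3, "sigma": 0.1},  # prior on z2 kept fixed during learning
}
```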