Efficient Gradient-Based Inference through Transformations between Bayes Nets and Neural Nets
Authors: Diederik Kingma, Max Welling
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretical results are supported by experiments. (Abstract), Experimental results are shown in section 6. (Section 1.1), We applied a Hybrid Monte Carlo (HMC) sampler to a Dynamic Bayesian Network (DBN)... (Section 6.1), trained on a small (1000 datapoints) and large (50000 datapoints) version of the MNIST dataset. (Section 6.2) |
| Researcher Affiliation | Academia | Diederik P. Kingma D.P.KINGMA@UVA.NL Max Welling M.WELLING@UVA.NL Machine Learning Group, University of Amsterdam |
| Pseudocode | No | No pseudocode or algorithm blocks found. |
| Open Source Code | No | No explicit statement of, or link to, open-source code for the described methodology was found. |
| Open Datasets | Yes | The model was trained on a small (1000 datapoints) and large (50000 datapoints) version of the MNIST dataset. (Section 6.2) |
| Dataset Splits | No | The paper mentions training on MNIST but does not provide specific details on train/validation/test splits, percentages, or sample counts for these splits. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments are provided. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names, frameworks, or solvers with their versions) are provided. |
| Experiment Setup | Yes | For HMC we used 10 leapfrog steps per sample, and the stepsize was automatically adjusted while sampling to obtain a HMC acceptance rate of around 0.9. At each sampling run, the first 1000 HMC samples were thrown away (burn-in); the subsequent 4000 HMC samples were kept. (Section 6.1), For MCEM, we used HMC with 10 leapfrog steps followed by a weight update using Adagrad (Duchi et al., 2010). For MMCL, we used L ∈ {10, 100, 500}. (Section 6.2) |
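The quoted setup (10 leapfrog steps, stepsize adapted toward an acceptance rate of ~0.9, 1000 burn-in samples discarded, 4000 kept) can be sketched as a minimal HMC sampler. This is a hedged illustration on a toy 1D standard-normal target, not the paper's implementation: the function names, the multiplicative stepsize-adaptation rule, and the target distribution are all assumptions introduced here for clarity.

```python
import math
import random

def hmc_sample(log_prob, grad_log_prob, x0, n_samples=4000, n_burnin=1000,
               n_leapfrog=10, seed=0):
    """Minimal 1D Hybrid Monte Carlo sketch (hypothetical helper, not the
    paper's code). Mirrors the quoted setup: 10 leapfrog steps, stepsize
    adapted toward ~0.9 acceptance, first n_burnin samples discarded."""
    rng = random.Random(seed)
    x, step, samples = x0, 0.1, []
    for i in range(n_burnin + n_samples):
        p = rng.gauss(0.0, 1.0)  # resample auxiliary momentum
        x_new, p_new = x, p
        # Leapfrog integration of the Hamiltonian dynamics.
        p_new += 0.5 * step * grad_log_prob(x_new)
        for _ in range(n_leapfrog - 1):
            x_new += step * p_new
            p_new += step * grad_log_prob(x_new)
        x_new += step * p_new
        p_new += 0.5 * step * grad_log_prob(x_new)
        # Metropolis accept/reject on the joint energy H(x, p).
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        accept = math.log(rng.random()) < h_old - h_new
        if accept:
            x = x_new
        # Crude multiplicative adaptation: the factors 1.01 / 0.914 balance
        # at roughly 90% acceptance (an assumption; strictly, adapting after
        # burn-in perturbs detailed balance and is only approximate).
        step *= 1.01 if accept else 0.914
        if i >= n_burnin:
            samples.append(x)
    return samples

# Toy target: standard normal, log p(x) = -x^2/2 up to a constant.
samples = hmc_sample(lambda x: -0.5 * x * x, lambda x: -x, x0=0.0)
mean = sum(samples) / len(samples)
```

For a standard-normal target the retained 4000 samples should have mean near 0 and variance near 1; the paper applies the same machinery to far more complex DBN posteriors.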