DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
Authors: Arash Vahdat, Evgeny Andriyash, William G. Macready
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. |
| Researcher Affiliation | Industry | Arash Vahdat, Evgeny Andriyash, William G. Macready, Quadrant.ai, D-Wave Systems Inc., Burnaby, BC, Canada. {arash,evgeny,bill}@quadrant.ai |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | An implementation which reproduces these results is available at https://github.com/QuadrantAI/dvae. |
| Open Datasets | Yes | Experiments on the MNIST [43] and OMNIGLOT [44] datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. |
| Dataset Splits | No | The paper uses the MNIST and OMNIGLOT datasets but does not explicitly provide train/validation/test split percentages, sample counts, or citations to predefined splits. It mentions 'test set log-likelihoods' and uses cross-validation for hyperparameter tuning, but does not describe how the data were partitioned. |
| Hardware Specification | No | The paper mentions 'a GPU implementation of population annealing' but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions 'QuPA, a GPU implementation of population annealing' and refers to an automatic differentiation (AD) library, but does not provide specific version numbers for these or for other software dependencies such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We use the same initialization scheme, batch-size, optimizer, number of training iterations, schedule of learning rate, weight decay and KL warm-up for training that was used in [20] (See Sec. 7.2 in [20]). For the mean-field optimization, we use 5 iterations. To evaluate the trained models, we estimate the log-likelihood on the discrete graphical model using the importance-weighted bound with 4000 samples [21]. (A minimal sketch of this estimator follows the table.) |
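
For concreteness, the importance-weighted evaluation quoted in the last row can be sketched as below. This is illustrative code, not taken from the QuadrantAI/dvae repository; the callables `sample_posterior`, `log_joint`, and `log_q` are hypothetical stand-ins for a trained model's encoder, its joint density under the Boltzmann prior and decoder, and its approximate posterior density.

```python
import numpy as np
from scipy.special import logsumexp

def iwae_log_likelihood(x, sample_posterior, log_joint, log_q, k=4000):
    """Importance-weighted log-likelihood bound (the [21] of the quoted setup).

    Hypothetical callables standing in for a trained discrete VAE:
      sample_posterior(x, k) -> k latent samples z ~ q(z|x)
      log_joint(x, z)        -> length-k array of log p(x, z_i)
      log_q(z, x)            -> length-k array of log q(z_i | x)
    """
    z = sample_posterior(x, k)                 # z_1, ..., z_k ~ q(z|x)
    log_w = log_joint(x, z) - log_q(z, x)      # log importance weights
    # log (1/k) * sum_i exp(log_w_i), computed stably
    return logsumexp(log_w) - np.log(k)
```

With k = 4000 samples the bound is tight enough to be reported as a test-set log-likelihood estimate; averaging it over all test images yields the kind of numbers the paper quotes for MNIST and OMNIGLOT.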