DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors

Authors: Arash Vahdat, Evgeny Andriyash, William Macready

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. |
| Researcher Affiliation | Industry | Arash Vahdat, Evgeny Andriyash, William G. Macready. Quadrant.ai, D-Wave Systems Inc., Burnaby, BC, Canada. {arash,evgeny,bill}@quadrant.ai |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | An implementation which reproduces these results is available at https://github.com/QuadrantAI/dvae. |
| Open Datasets | Yes | Experiments on the MNIST [43] and OMNIGLOT [44] datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors. |
| Dataset Splits | No | The paper uses the MNIST and OMNIGLOT datasets but does not explicitly provide train/validation/test split percentages, sample counts, or citations to predefined splits. It mentions only test-set log-likelihoods, and cross-validation for hyperparameter tuning rather than for data partitioning. |
| Hardware Specification | No | The paper mentions "a GPU implementation of population annealing" but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions QuPA, a GPU implementation of population annealing, and refers to an automatic differentiation (AD) library, but does not provide specific version numbers for these or for other software dependencies such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | "We use the same initialization scheme, batch size, optimizer, number of training iterations, schedule of learning rate, weight decay and KL warm-up for training that was used in [20] (see Sec. 7.2 in [20]). For the mean-field optimization, we use 5 iterations. To evaluate the trained models, we estimate the log-likelihood on the discrete graphical model using the importance-weighted bound with 4000 samples [21]." |
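The quoted setup mentions five mean-field iterations over the Boltzmann prior. As a rough illustration of what a mean-field fixed-point update for a Boltzmann machine looks like, here is a minimal sketch; it is a generic textbook update, not necessarily the exact procedure used in the paper, and all variable names and the `sigmoid` helper are ours:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field_boltzmann(W, b, num_iters=5):
    """Generic mean-field fixed-point iteration for a Boltzmann machine
    with symmetric pairwise weights W (zero diagonal) and biases b.
    Returns approximate marginals m_i = q(z_i = 1)."""
    m = np.full(b.shape, 0.5)      # uniform initialization
    for _ in range(num_iters):     # the paper reports 5 iterations
        # Update all coordinates: m_i <- sigmoid(b_i + sum_j W_ij * m_j)
        m = sigmoid(b + W @ m)
    return m

# Example on a small random Boltzmann machine
rng = np.random.default_rng(0)
n = 8
W = rng.normal(scale=0.1, size=(n, n))
W = (W + W.T) / 2.0                # symmetrize couplings
np.fill_diagonal(W, 0.0)           # no self-couplings
b = rng.normal(scale=0.1, size=n)
print(mean_field_boltzmann(W, b))
```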
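The evaluation uses the importance-weighted bound [21] with 4000 samples. A minimal sketch of that estimator follows, assuming hypothetical `log_joint(x, z)` and encoder `sample(x)` / `log_prob(z, x)` interfaces; none of these names come from the paper or its repository:

```python
import numpy as np
from scipy.special import logsumexp

def iwae_log_likelihood(x, encoder, log_joint, k=4000):
    """Importance-weighted estimate of log p(x):
    log (1/k) * sum_i p(x, z_i) / q(z_i | x), with z_i ~ q(. | x).
    `encoder.sample`, `encoder.log_prob`, and `log_joint` are
    hypothetical interfaces used only for this sketch."""
    log_w = np.empty(k)
    for i in range(k):
        z = encoder.sample(x)                      # z_i ~ q(z | x)
        log_w[i] = log_joint(x, z) - encoder.log_prob(z, x)
    # Log-mean-exp of the importance weights, computed stably
    return logsumexp(log_w) - np.log(k)
```

With k = 1 this reduces to a single-sample ELBO estimate; larger k (here 4000, matching the quoted setup) tightens the bound toward the true log-likelihood.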