Hamiltonian Variational Auto-Encoder
Authors: Anthony L. Caterini, Arnaud Doucet, Dino Sejdinovic
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we discuss the experiments used to validate our method. We first test HVAE on an example with a tractable full log likelihood (where no neural networks are needed), and then perform larger-scale tests on the MNIST dataset. |
| Researcher Affiliation | Academia | Anthony L. Caterini¹, Arnaud Doucet¹,², Dino Sejdinovic¹,²; ¹Department of Statistics, University of Oxford; ²Alan Turing Institute for Data Science |
| Pseudocode | Yes | Algorithm 1 Hamiltonian ELBO, Fixed Tempering (a hedged sketch of the leapfrog-based ELBO estimator appears after this table) |
| Open Source Code | Yes | Code is available online. |
| Open Datasets | Yes | The next experiment that we consider is using HVAE to improve upon a convolutional variational auto-encoder (VAE) for the binarized MNIST handwritten digit dataset. (...) We use the standard stochastic binarization of MNIST [24] as training data |
| Dataset Splits | Yes | We also employ early stopping by halting the training procedure if there is no improvement in the loss on validation data over 100 epochs. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. It only mentions that models were trained using TensorFlow. |
| Software Dependencies | No | The paper states 'All models were trained using TensorFlow [1]' but does not provide a specific version number for TensorFlow or any other software dependencies with version details. |
| Experiment Setup | Yes | All experiments have N = 10,000 and all training was done using RMSProp [27] with a learning rate of 10⁻³. (...) We train using Adamax [14] with learning rate 10⁻³. We also employ early stopping by halting the training procedure if there is no improvement in the loss on validation data over 100 epochs. (...) The inference network consists of three convolutional layers, each with filters of size 5 × 5 and a stride of 2. The convolutional layers output 16, 32, and 32 feature maps, respectively. The output of the third layer is fed into a fully-connected layer with hidden dimension nh = 450, whose output is then fully connected to the output means and standard deviations each of size . Softplus activation functions are used throughout the network except immediately before the outputted mean. (A hedged reconstruction of this inference network appears after the table.) |
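
The pseudocode row refers to Algorithm 1 (Hamiltonian ELBO, Fixed Tempering). The sketch below is not the authors' code; it illustrates only the core idea in plain NumPy: sample (z₀, ρ₀) from the variational distribution and a standard-normal momentum, flow them through a volume-preserving leapfrog integrator on U(z) = −log p(x, z), and form an unbiased ELBO estimate from the start and end densities. The fixed-tempering rescaling of the momentum used in Algorithm 1 is omitted, and `log_p_xz`, `grad_log_p_xz`, `mu`, and `sigma` are assumed inputs.

```python
# Minimal sketch (not the paper's Algorithm 1): Hamiltonian ELBO estimate
# without tempering. Leapfrog is volume-preserving, so no Jacobian term.
import numpy as np

def leapfrog(z, rho, step_size, n_steps, grad_log_p):
    """Run K leapfrog steps on U(z) = -log p(x, z)."""
    for _ in range(n_steps):
        rho = rho + 0.5 * step_size * grad_log_p(z)  # half momentum step
        z = z + step_size * rho                      # full position step
        rho = rho + 0.5 * step_size * grad_log_p(z)  # half momentum step
    return z, rho

def log_normal(x, mean, std):
    """Log density of a diagonal Gaussian, summed over dimensions."""
    return np.sum(-0.5 * np.log(2 * np.pi) - np.log(std)
                  - 0.5 * ((x - mean) / std) ** 2)

def hamiltonian_elbo(x, mu, sigma, log_p_xz, grad_log_p_xz,
                     step_size=0.1, n_steps=5, rng=np.random):
    """Single-sample ELBO: deterministic leapfrog flow of (z0, rho0),
    corrected by the initial variational and momentum densities."""
    z0 = mu + sigma * rng.standard_normal(mu.shape)   # reparameterised z0
    rho0 = rng.standard_normal(mu.shape)              # auxiliary momentum
    zK, rhoK = leapfrog(z0, rho0, step_size, n_steps,
                        lambda z: grad_log_p_xz(x, z))
    zeros, ones = np.zeros_like(mu), np.ones_like(mu)
    return (log_p_xz(x, zK)
            + log_normal(rhoK, zeros, ones)   # end momentum under N(0, I)
            - log_normal(z0, mu, sigma)       # initial variational density
            - log_normal(rho0, zeros, ones))  # initial momentum density
```

Because the leapfrog map has unit Jacobian, the exponential of this quantity is an unbiased estimate of p(x), so its expectation lower-bounds log p(x); adding the fixed tempering schedule of Algorithm 1 would introduce momentum rescalings and the corresponding correction terms.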
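For the experiment-setup row, the following is a hedged reconstruction of the quoted MNIST inference network written with tf.keras (the paper only states that TensorFlow was used, without a version). The latent dimension is elided in the extract, so `latent_dim` here is an assumption.

```python
# Hedged sketch of the quoted inference network, not the authors' code.
import tensorflow as tf

def make_inference_net(latent_dim=64):  # latent_dim is an assumed value
    """Three 5x5 stride-2 conv layers (16/32/32 maps), a 450-unit dense
    layer, then separate heads for the mean (no activation) and the
    softplus standard deviation, as described in the quoted setup."""
    x = tf.keras.Input(shape=(28, 28, 1))
    h = tf.keras.layers.Conv2D(16, 5, strides=2, padding="same",
                               activation="softplus")(x)
    h = tf.keras.layers.Conv2D(32, 5, strides=2, padding="same",
                               activation="softplus")(h)
    h = tf.keras.layers.Conv2D(32, 5, strides=2, padding="same",
                               activation="softplus")(h)
    h = tf.keras.layers.Flatten()(h)
    h = tf.keras.layers.Dense(450, activation="softplus")(h)
    mean = tf.keras.layers.Dense(latent_dim)(h)                  # mean head
    std = tf.keras.layers.Dense(latent_dim, activation="softplus")(h)
    return tf.keras.Model(inputs=x, outputs=[mean, std])
```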