Hierarchical Decompositional Mixtures of Variational Autoencoders
Authors: Ping Liang Tan, Robert Peharz
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments we show that our models outperform classical VAEs on almost all of our experimental benchmarks. Moreover, we show that our model is highly data efficient and degrades very gracefully in extremely low data regimes. We compare SPVAEs with classical VAEs on the MNIST, CIFAR-10, and SVHN image datasets. For each dataset, we consider two versions: i) interpreting images as continuous signals and using Gaussians for the VAE outputs, and ii) interpreting images as discrete data on {0, ..., 255} and using Binomial distributions as VAE outputs. On all of these 6 benchmarks, SPVAEs clearly outperform classical VAEs in terms of test likelihood (estimated with 5000 importance-weighted samples). At the same time, due to their decompositional nature, SPVAE models are almost an order of magnitude smaller than VAEs. Moreover, we show that SPVAEs are more data efficient than VAEs: on all benchmarks we can reduce the amount of training data down to 10%, without significantly deteriorating the test performance. Even for extremely low data regimes, SPVAEs degrade much more gracefully than VAEs. Table 1. Performance on test set, 5000-sample IWAE ELBO. Figure 2. Degradation of test ELBO as training set size is reduced. (A sketch of the importance-weighted ELBO estimator appears after this table.) |
| Researcher Affiliation | Collaboration | ¹Department of Engineering, University of Cambridge, UK; ²DSO National Laboratories, Singapore. Correspondence to: Ping Liang Tan <plt28j@gmail.com>, Robert Peharz <rp587@cam.ac.uk>. |
| Pseudocode | Yes | Algorithm 1 Stochastic Variational EM for SPVAE |
| Open Source Code | Yes | Code available under https://github.com/cambridge-mlg/SPVAE. |
| Open Datasets | Yes | We compare SPVAEs with classical VAEs on the MNIST (LeCun et al.), CIFAR-10 (Krizhevsky, 2009), and SVHN (Netzer et al., 2011) image datasets. |
| Dataset Splits | Yes | For MNIST, we randomly selected 10k images from the training set for the validation set; for CIFAR-10, the 60k images were randomly divided into sets of sizes 40k / 10k / 10k, which were used as training, validation, and test sets. For SVHN, we used the first 26032 images from the extra set as validation set (i.e. of the same size as the test set). (A sketch of these splits is given after this table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments; it only mentions the software framework TensorFlow. |
| Software Dependencies | No | We implemented all models in Tensorflow (Abadi M. et al., 2015) and used Adam (Kingma & Ba, 2015) with its default parameters for optimizing the respective ELBOs. The paper mentions TensorFlow and the Adam optimizer but does not provide version numbers for these or any other software components. |
| Experiment Setup | Yes | During training, we consistently used 5 importance weighted samples for ELBO estimates (2). We used a batch size of 128 throughout all our experiments. The quality of density estimation in VAEs and SPVAEs depends both on the model size and the dimensionality of the latent codes. Thus, we treated the number of hidden units H per neural network layer and the dimensionality nz of latent VAE codes as hyper-parameters, and cross-validated them on a validation set. For MNIST, we randomly selected 10k images from the training set for the validation set; for CIFAR-10, the 60k images were randomly divided into sets of sizes 40k / 10k / 10k, which were used as training, validation, and test sets. For SVHN, we used the first 26032 images from the extra set as validation set (i.e. of the same size as the test set). The same H was used for each layer (decoder and encoder), and in the case of SPVAEs, for each VAE leaf. In order to keep the sizes of the overall models comparable, we used ranges nz ∈ {1, 2, 5, 25, 50, 100, 150, 200}, H ∈ {30, 100, 300, 600} for VAEs, and nz ∈ {1, 2, 5, 25, 50}, H ∈ {8, 16, 32} for SPVAEs. No regularization was applied, but we used early stopping in order to prevent overfitting. In particular, we evaluated the training progress on the validation set every 128 batches and stopped training if the performance on the validation set decreased five times consecutively. (A sketch of this early-stopping and grid-search protocol is given after this table.) |
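
The test likelihoods quoted above are 5000-sample importance-weighted ELBO (IWAE) estimates, and training uses the same estimator with 5 samples. As a rough illustration of how such a K-sample bound is computed, here is a minimal NumPy sketch; the function name `iwae_bound` and the (K, batch) array layout are our own assumptions, not code from the SPVAE repository.

```python
import numpy as np

def iwae_bound(log_p_xz, log_q_zx):
    """K-sample importance-weighted ELBO estimate (illustrative sketch).

    log_p_xz : (K, batch) array of log p(x, z_k) for samples z_k ~ q(z | x)
    log_q_zx : (K, batch) array of log q(z_k | x)
    Returns, per example, log (1/K) * sum_k p(x, z_k) / q(z_k | x),
    evaluated with a log-sum-exp trick for numerical stability.
    """
    log_w = log_p_xz - log_q_zx                  # unnormalised log importance weights
    m = np.max(log_w, axis=0, keepdims=True)     # stabiliser for the exponentials
    return m[0] + np.log(np.mean(np.exp(log_w - m), axis=0))
```

With K = 1 this reduces to the standard ELBO; the paper reports K = 5 during training and K = 5000 for the test numbers in Table 1.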
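
The split protocol in the "Dataset Splits" row amounts to a few lines of array indexing. The helpers below, assuming the images are available as NumPy arrays and using an arbitrary seed (the paper does not state one), mirror the described MNIST, CIFAR-10, and SVHN splits; they are a sketch, not the authors' code.

```python
import numpy as np

def split_train_val(images, val_size=10_000, seed=0):
    """MNIST-style split: hold out `val_size` randomly chosen images for validation."""
    idx = np.random.default_rng(seed).permutation(len(images))
    return images[idx[val_size:]], images[idx[:val_size]]

def split_40_10_10(images, seed=0):
    """CIFAR-10-style split: shuffle all 60k images and cut 40k / 10k / 10k."""
    idx = np.random.default_rng(seed).permutation(len(images))
    return images[idx[:40_000]], images[idx[40_000:50_000]], images[idx[50_000:]]

def svhn_validation(extra_images, val_size=26_032):
    """SVHN: use the first 26,032 images of the 'extra' split as validation,
    matching the size of the official test set."""
    return extra_images[:val_size]
```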
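
Finally, the training protocol in the "Experiment Setup" row (batch size 128, a validation check every 128 batches, stopping after five consecutive decreases, and a grid search over nz and H) fits in a short loop. The sketch below is our reading of that protocol: `model.train_step`, `eval_on_validation`, `build_spvae`, and the interpretation of "decreased five times consecutively" as five successive drops relative to the previous evaluation are assumptions rather than details from the released code.

```python
import itertools

# Hyper-parameter grids quoted above (SPVAE values; the VAE grids are larger)
NZ_GRID = [1, 2, 5, 25, 50]
H_GRID = [8, 16, 32]

def train_with_early_stopping(model, batches, eval_on_validation,
                              eval_every=128, patience=5):
    """Take gradient steps until the validation ELBO drops `patience` times in a row."""
    prev = float("-inf")
    drops = 0
    for step, batch in enumerate(batches, start=1):
        model.train_step(batch)                  # one Adam step with default parameters
        if step % eval_every == 0:
            score = eval_on_validation(model)
            drops = drops + 1 if score < prev else 0  # reset whenever the ELBO does not decrease
            prev = score
            if drops >= patience:                # five consecutive decreases: stop
                break
    return model

# Cross-validation over the grid (model construction left abstract):
# for nz, h in itertools.product(NZ_GRID, H_GRID):
#     model = build_spvae(nz=nz, hidden_units=h)        # hypothetical constructor
#     train_with_early_stopping(model, batch_iter(128), validate_elbo)
```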