Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders
Authors: Adeel Pervez, Efstratios Gavves
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform an extensive array of evaluations with state-of-the-art benchmarks, methods, and architectures. Unless stated otherwise, we use the same architectures in the respective comparisons for VAE with and without OU-smoothing. Our primary focus is posterior collapse, a fundamental problem linked to high variance when stacking multiple stochastic layers (Lucas et al., 2019a). We investigate OU-smoothed VAEs in the context of posterior collapse and compare with methods that help with it. Then, we evaluate OU-smoothed VAEs on binary MNIST, OMNIGLOT and CIFAR-10 with various convolutional and MLP architectures. We compute validation ELBOs with importance-weighted samples (Burda et al., 2016) (L100 with 100 and L5000 with 5000 samples). A sketch of this importance-weighted bound appears after the table. |
| Researcher Affiliation | Academia | 1QUVA Lab, Informatics Institute, University of Amsterdam, The Netherlands. |
| Pseudocode | No | The paper describes the algorithmic steps in narrative text (e.g., 'Algorithm. In essence, OU-smoothed variational autoencoders are similar to variational autoencoders...'), but it does not include a formal pseudocode block or a clearly labeled algorithm section. |
| Open Source Code | No | The paper does not provide any specific links to a code repository, an explicit statement that the source code is available, or mention of code in supplementary materials. |
| Open Datasets | Yes | Then, we evaluate OU-smoothed VAEs on binary MNIST, OMNIGLOT and CIFAR-10 with various convolutional and MLP architectures. |
| Dataset Splits | No | The paper uses well-known datasets like MNIST, OMNIGLOT, and CIFAR-10, and refers to 'validation ELBO' and 'test ELBO' (e.g., 'We compute validation ELBOs with importance-weighted samples'). However, it does not explicitly state the specific dataset split percentages (e.g., 80/10/10) or sample counts for training, validation, and test sets. It implies standard splits for these public datasets without explicitly detailing them in the text. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory specifications, or cloud computing instance types used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers (e.g., Python, PyTorch, TensorFlow versions, or CUDA versions), that would be necessary to fully replicate the experimental environment. |
| Experiment Setup | Yes | For KL annealing the annealing coefficient is set to 0 for the first 10,000 steps and is linearly annealed to 1 over the next 500,000 steps. For free bits, we apply the same free bits value to each stochastic layer, as recommended in IAF (Kingma et al., 2016). The free bits values are chosen from {0.5, 1.0, 2.0, 3.0}. Experiments show that 5 to 10 samples suffice. The memory usage is the maximum amount of used memory (batch size of 64). A sketch of the KL-annealing schedule and free-bits clamp appears after the table. |
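
The evaluation protocol quoted above reports importance-weighted ELBOs (L100 and L5000 with 100 and 5000 samples). The sketch below shows that bound for a plain single-latent-layer Gaussian VAE with Bernoulli outputs; the `model.encode`/`model.decode` interface and tensor shapes are illustrative assumptions, not the paper's hierarchical OU-smoothed architecture.

```python
import math
import torch
import torch.nn.functional as F

def importance_weighted_elbo(model, x, num_samples=100):
    """Importance-weighted bound of Burda et al. (2016): log (1/K) * sum_k w_k.

    Assumes a hypothetical single-latent-layer VAE exposing
    `model.encode(x) -> (mu, logvar)` and `model.decode(z) -> logits`,
    with Bernoulli pixel likelihoods (binary MNIST-style data).
    """
    mu, logvar = model.encode(x)          # parameters of q(z|x)
    std = torch.exp(0.5 * logvar)
    log2pi = math.log(2 * math.pi)

    log_ws = []
    for _ in range(num_samples):
        eps = torch.randn_like(std)
        z = mu + std * eps                # reparameterized sample z ~ q(z|x)

        # log p(x|z): sum of Bernoulli log-likelihoods over pixels
        logits = model.decode(z)
        log_px_z = -F.binary_cross_entropy_with_logits(
            logits, x, reduction="none").flatten(1).sum(-1)

        # log p(z) under the standard-normal prior, log q(z|x) under the encoder
        log_pz = -0.5 * (z.pow(2) + log2pi).sum(-1)
        log_qz_x = -0.5 * (eps.pow(2) + logvar + log2pi).sum(-1)

        log_ws.append(log_px_z + log_pz - log_qz_x)   # log importance weight per example

    log_ws = torch.stack(log_ws, dim=0)               # shape (K, batch)
    # Stable log-mean-exp over the K samples gives the L_K bound (e.g. K=100 or 5000)
    return torch.logsumexp(log_ws, dim=0) - math.log(num_samples)
```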
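
The experiment-setup row quotes a KL-annealing schedule (coefficient 0 for the first 10,000 steps, then linear to 1 over the next 500,000 steps) and per-layer free bits chosen from {0.5, 1.0, 2.0, 3.0}. The minimal sketch below implements both utilities with those quoted numbers; the function names and the commented wiring into a training step are illustrative assumptions.

```python
import torch

def kl_annealing_coefficient(step, zero_steps=10_000, anneal_steps=500_000):
    """Schedule quoted in the paper: the KL coefficient is 0 for the first
    10,000 steps, then rises linearly to 1 over the next 500,000 steps."""
    if step < zero_steps:
        return 0.0
    return min(1.0, (step - zero_steps) / anneal_steps)

def free_bits_kl(kl_per_layer, free_bits=1.0):
    """Free-bits regularizer (Kingma et al., 2016): clamp each stochastic
    layer's KL term from below at `free_bits` nats so no layer is pushed all
    the way to zero; the quoted values come from {0.5, 1.0, 2.0, 3.0}."""
    return sum(torch.clamp(kl, min=free_bits) for kl in kl_per_layer)

# Illustrative wiring into a training step (assumed variable names; KL annealing shown):
# loss = recon_loss + kl_annealing_coefficient(global_step) * sum(kl_per_layer)
# or, with free bits instead of annealing:
# loss = recon_loss + free_bits_kl(kl_per_layer, free_bits=2.0)
```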