Hierarchical Importance Weighted Autoencoders

Authors: Chin-Wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we confirm that maximization of the lower bound does implicitly minimize variance. Further analysis shows that this is a result of the negative correlation induced by the proposed hierarchical meta-sampling scheme, and the performance of inference also improves as the number of samples increases. From Section 6 (Experiments): We first demonstrate the effect of sharing a common random number on the dependency among the multiple proposals, and then apply the amortized version of hierarchical importance sampling (that is, H-IWAE) to learning deep latent Gaussian models. (See the IWAE bound sketch after the table.)
Researcher Affiliation | Collaboration | 1 Mila, University of Montreal; 2 Element AI; 3 CIFAR member. Correspondence to: Chin-Wei Huang <chinwei.huang@umontreal.ca>.
Pseudocode | No | Explanation for No: The paper describes its methods and derivations in mathematical formulations and prose but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | Explanation for No: The paper does not contain any explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | Our final experiment was to apply hierarchical proposals to learning variational autoencoders on a set of standard datasets, including binarized MNIST (Larochelle & Murray, 2011), binarized OMNIGLOT (Lake et al., 2015) and Caltech101 Silhouettes (Marlin et al., 2010). (See the data-loading sketch after the table.)
Dataset Splits | No | Results on the binarized MNIST and OMNIGLOT are in Tables 1 and 2, respectively. ... L_tr, L_va, L_te stand for the lower bound on the log likelihood of the dataset (training, validation and test).
Hardware Specification | No | Explanation for No: The paper does not provide any specific details about the hardware used for running the experiments, such as GPU or CPU models, or memory specifications.
Software Dependencies | No | Explanation for No: The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experimental setup.
Experiment Setup | Yes | The hyperparameters are fixed as follows: minibatch size 64, learning rate 5 × 10^-5, a linear annealing schedule with 50,000 iterations for the log density terms except p(x|z) (i.e., the KL between q(z|x) and p(z) for the VAE), and Polyak averaging with exponential averaging coefficient 0.998. (See the training-loop sketch after the table.)
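
To make the Research Type row concrete, here is a minimal sketch of the K-sample importance weighted (IWAE) bound that H-IWAE builds on, assuming PyTorch, a toy Gaussian encoder, and a Bernoulli decoder. The names (ToyIWAE, iwae_bound), architecture sizes, and K are illustrative assumptions, not taken from the paper; the paper's hierarchical meta-sampling scheme additionally couples the K proposals through a shared common random number, which this sketch omits (it draws K independent proposals).

```python
# Minimal sketch of the K-sample IWAE bound (illustrative, not the paper's
# hierarchical sampler): the hierarchical scheme would couple the K
# proposals through a shared random number, whereas this draws them i.i.d.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyIWAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, h_dim=200):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, x_dim))

    def iwae_bound(self, x, K=8):
        """E_q[ log (1/K) sum_k p(x, z_k) / q(z_k | x) ], a lower bound on log p(x)."""
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)        # [B, z]
        std = torch.exp(0.5 * logvar)
        eps = torch.randn(K, *mu.shape)                        # K independent proposals
        z = mu + std * eps                                     # [K, B, z]
        # Diagonal Gaussian log densities: q(z|x) and the standard normal prior p(z).
        log_q = -0.5 * (eps ** 2 + logvar + math.log(2 * math.pi)).sum(-1)
        log_p = -0.5 * (z ** 2 + math.log(2 * math.pi)).sum(-1)
        # Bernoulli decoder log-likelihood log p(x|z) for binarized data.
        logits = self.dec(z)                                   # [K, B, x]
        log_px_z = -F.binary_cross_entropy_with_logits(
            logits, x.expand(K, *x.shape), reduction="none").sum(-1)
        log_w = log_px_z + log_p - log_q                       # [K, B] log importance weights
        return (torch.logsumexp(log_w, dim=0) - math.log(K)).mean()

x = torch.rand(64, 784).round()        # dummy binarized minibatch
print(ToyIWAE().iwae_bound(x, K=8))    # scalar lower bound on log p(x)
```

Per the quoted passage, it is the sharing of a common random number across the proposals that induces the negative correlation credited with reducing the variance of these importance weights.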
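
For the Open Datasets row, the snippet below is a hedged illustration of loading one of the listed benchmarks (MNIST) with torchvision and binarizing it on the fly. The tooling is an assumption: the paper cites the fixed binarization of Larochelle & Murray (2011), whereas this sketch samples a binarization from the torchvision copy, and the root path and batch handling are illustrative.

```python
# Illustrative MNIST loader with on-the-fly binarization (an assumption;
# the paper uses the fixed Larochelle & Murray binarization).
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

binarize = transforms.Compose([
    transforms.ToTensor(),                                      # grayscale images in [0, 1]
    transforms.Lambda(lambda t: torch.bernoulli(t).view(-1)),   # stochastic binarization, flattened
])

train_set = datasets.MNIST("data/", train=True, download=True, transform=binarize)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)  # minibatch size 64, as reported

x, _ = next(iter(train_loader))
print(x.shape)   # torch.Size([64, 784])
```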
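
For the Experiment Setup row, the sketch below wires the reported hyperparameters (minibatch size 64, learning rate 5 × 10^-5, linear annealing over 50,000 iterations, Polyak averaging with coefficient 0.998) into a generic training loop. The optimizer choice (Adam), the total iteration count, and the annealed_bound method are assumptions introduced for illustration; the paper does not specify them.

```python
# Hedged sketch of the reported training configuration: batch size 64 is
# set on the DataLoader; Adam and `annealed_bound` are illustrative assumptions.
import copy
import torch

def train(model, loader, iterations=100_000, anneal_steps=50_000, polyak=0.998):
    opt = torch.optim.Adam(model.parameters(), lr=5e-5)     # learning rate from the paper
    avg_model = copy.deepcopy(model)                        # Polyak-averaged copy for evaluation
    data, step = iter(loader), 0
    while step < iterations:                                # total iterations: illustrative
        try:
            x, _ = next(data)                               # (image, label) minibatches of 64
        except StopIteration:
            data = iter(loader)
            x, _ = next(data)
        beta = min(1.0, step / anneal_steps)                # linear annealing coefficient in [0, 1]
        # `annealed_bound` is hypothetical: the lower bound with the log density
        # terms other than log p(x|z) weighted by beta.
        loss = -model.annealed_bound(x, beta)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                               # exponential averaging, coefficient 0.998
            for p_avg, p in zip(avg_model.parameters(), model.parameters()):
                p_avg.mul_(polyak).add_(p, alpha=1.0 - polyak)
        step += 1
    return avg_model                                        # evaluate with the averaged weights
```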