Hierarchical Importance Weighted Autoencoders
Authors: Chin-Wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we confirm that maximization of the lower bound does implicitly minimize variance. Further analysis shows that this is a result of negative correlation induced by the proposed hierarchical meta-sampling scheme, and that the performance of inference also improves when the number of samples increases. From Section 6 (Experiments): We first demonstrate the effect of sharing a common random number on the dependency among the multiple proposals, and then apply the amortized version of hierarchical importance sampling (that is, H-IWAE) to learning deep latent Gaussian models. (An illustrative sketch of the K-sample bound appears after this table.) |
| Researcher Affiliation | Collaboration | (1) Mila, University of Montreal; (2) Element AI; (3) CIFAR member. Correspondence to: Chin-Wei Huang <chinwei.huang@umontreal.ca>. |
| Pseudocode | No | Explanation for No: The paper describes its methods and derivations in mathematical formulations and prose but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Explanation for No: The paper does not contain any explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | Our final experiment was to apply hierarchical proposals to learning variational autoencoders on a set of standard datasets, including binarized MNIST (Larochelle & Murray, 2011), binarized OMNIGLOT (Lake et al., 2015) and Caltech101 Silhouettes (Marlin et al., 2010). |
| Dataset Splits | No | Results on the binarized MNIST and OMNIGLOT are in Tables 1 and 2, respectively. ... L_tr, L_va, L_te stand for the lower bound on the log likelihood of the dataset (training, validation and test). |
| Hardware Specification | No | Explanation for No: The paper does not provide any specific details about the hardware used for running the experiments, such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | Explanation for No: The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) that would be needed to replicate the experimental setup. |
| Experiment Setup | Yes | The hyperparameters are fixed as follows: minibatch size 64, learning rate 5 × 10^-5, linear annealing schedule with 50,000 iterations for the log-density terms except p(x|z) (i.e., the KL between q(z|x) and p(z) for the VAE), and Polyak averaging with exponential averaging coefficient 0.998. |
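
The quoted hyperparameters can be condensed into a minimal training-configuration sketch. This is an illustration under assumptions: PyTorch and the helper names `kl_weight` and `polyak_update` are ours, not the authors' released code; only the numeric values come from the paper.

```python
# Minimal sketch of the reported training configuration.
# Assumptions: PyTorch and the helper functions are illustrative choices;
# only the numeric values are taken from the paper's experiment setup.
import torch

BATCH_SIZE = 64        # minibatch size
LEARNING_RATE = 5e-5   # learning rate (5 x 10^-5)
ANNEAL_ITERS = 50_000  # linear annealing schedule for the log-density (KL) terms
POLYAK_COEF = 0.998    # exponential averaging coefficient for Polyak averaging

def kl_weight(step: int) -> float:
    """Linearly anneal the KL weight from 0 to 1 over ANNEAL_ITERS steps."""
    return min(1.0, step / ANNEAL_ITERS)

def polyak_update(avg_params, params, coef: float = POLYAK_COEF) -> None:
    """Exponential moving average of model parameters (Polyak averaging)."""
    with torch.no_grad():
        for p_avg, p in zip(avg_params, params):
            p_avg.mul_(coef).add_(p, alpha=1.0 - coef)

# Optimizer choice is not specified in the quoted excerpt; e.g.:
# optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
```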
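
For context on the lower bound referenced in the Research Type row, the following is a minimal sketch of the standard K-sample importance weighted (IWAE) bound that the paper's hierarchical proposals build on. The hierarchical meta-sampling scheme itself is not reproduced here, and the function name and tensor layout are assumptions.

```python
# Sketch of the standard K-sample importance weighted lower bound (IWAE),
# not the paper's hierarchical meta-sampling scheme.
import math
import torch

def iwae_bound(log_p_joint: torch.Tensor, log_q: torch.Tensor) -> torch.Tensor:
    """K-sample importance weighted lower bound on log p(x).

    Both inputs have shape (K, batch): log p(x, z_k) and log q(z_k | x)
    evaluated at K proposal samples z_1, ..., z_K per data point.
    """
    log_w = log_p_joint - log_q  # log importance weights
    k = log_w.size(0)
    # log((1/K) * sum_k w_k), computed stably with logsumexp, averaged over the batch
    return (torch.logsumexp(log_w, dim=0) - math.log(k)).mean()
```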