Information Theoretic lower bounds on negative log likelihood
Authors: Luis A. Lastras-Montaño
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we test whether the lower bound derived and the corresponding fundamental quantity c(z) are useful in practice when making modeling decisions by applying these ideas to a problem in image modeling for which there have been several recent results involving Variational Autoencoders. (Section 4, Experimental Validation) |
| Researcher Affiliation | Industry | Luis A. Lastras-Montaño, IBM Research AI, Yorktown Heights, NY 10598, USA, lastrasl@us.ibm.com |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | Our experimental setup is an extension of the publicly available source code that the authors of the Vamp Prior article (Tomczak & Welling, 2018) used in their work. - This indicates they used existing open-source code, but not that their modifications or new code are open-source. There is no explicit statement of releasing their code. |
| Open Datasets | Yes | The data sets that we will use are the image modeling data sets Static MNIST (Larochelle & Murray, 2011), OMNIGLOT (Lake et al., 2015), Caltech 101 Silhouettes (Marlin et al., 2010), Frey Faces, Histopathology (Tomczak & Welling, 2016) and CIFAR (Krizhevsky, 2009). |
| Dataset Splits | Yes | The methodology that we follow, borrowed from (Tomczak & Welling, 2018), involves training a model with checkpoints, which store the best model found from the standpoint of its performance on a validation data set. Following the MNIST data set structure, we created a data set with 50K training, 10K validation and 10K test images by sampling from the latent variable model. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications. It only refers to general computing environments. |
| Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2015) as the optimization algorithm' but does not name any software frameworks or libraries, let alone their version numbers. |
| Experiment Setup | Yes | For the choice of priors, we use the standard zero mean, unit variance Gaussian prior as well as the variational mixture of posteriors prior parametric family from (Tomczak & Welling, 2018) with 500 pseudo inputs for all experiments. Our choices for the autoencoder architectures are a single stochastic layer Variational Autoencoder (VAE) with two hidden layers (300 units each), a two stochastic layer hierarchical Variational Autoencoder (HVAE)... In all cases the dimension of the latent vector is 40 for both the first and second stochastic layers. We use Adam (Kingma & Ba, 2015) as the optimization algorithm. |
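
For concreteness, below is a minimal PyTorch sketch of the single-stochastic-layer VAE configuration quoted in the Experiment Setup row: two 300-unit hidden layers, a 40-dimensional latent vector, the standard zero-mean unit-variance Gaussian prior, and Adam as the optimizer. This is not the authors' code; the activation function, Bernoulli likelihood, input dimension (784, e.g. Static MNIST), and learning rate are assumptions, and the VampPrior variant with 500 pseudo-inputs is omitted.

```python
# Minimal sketch of the VAE setup described above (assumptions noted inline).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=300, z_dim=40):
        # Two 300-unit hidden layers and a 40-dim latent, per the quoted setup.
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, h_dim), nn.Tanh())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, x_dim))  # returns logits

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(z), mu, logvar

def neg_elbo(logits, x, mu, logvar):
    # Bernoulli reconstruction term (likelihood is an assumption) plus the
    # closed-form KL divergence to the standard N(0, I) Gaussian prior.
    recon = F.binary_cross_entropy_with_logits(logits, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
```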