Information Theoretic lower bounds on negative log likelihood

Authors: Luis A. Lastras-Montaño

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we test whether the lower bound derived and the corresponding fundamental quantity c(z) are useful in practice when making modeling decisions by applying these ideas to a problem in image modeling for which there have been several recent results involving Variational Autoencoders. (Section 4, Experimental Validation)
Researcher Affiliation | Industry | Luis A. Lastras-Montaño, IBM Research AI, Yorktown Heights, NY 10598, USA, lastrasl@us.ibm.com
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | Our experimental setup is an extension of the publicly available source code that the authors of the Vamp Prior article (Tomczak & Welling, 2018) used in their work. This indicates the authors built on existing open-source code, but there is no explicit statement that their own modifications or new code were released.
Open Datasets | Yes | The data sets that we will use are image modeling data sets: Static MNIST (Larochelle & Murray, 2011), OMNIGLOT (Lake et al., 2015), Caltech 101 Silhouettes (Marlin et al., 2010), Frey Faces, Histopathology (Tomczak & Welling, 2016) and CIFAR (Krizhevsky, 2009).
Dataset Splits | Yes | The methodology that we follow, borrowed from (Tomczak & Welling, 2018), involves training a model with checkpoints, which store the best model found from the standpoint of its performance on a validation data set. Following the MNIST data set structure, we created a data set with 50K training, 10K validation and 10K test images by sampling from the latent variable model.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications; it only refers to general computing environments.
Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2015) as the optimization algorithm' but does not specify a version number for Adam or any other software dependencies.
Experiment Setup | Yes | For the choice of priors, we use the standard zero mean, unit variance Gaussian prior as well as the variational mixture of posteriors prior parametric family from (Tomczak & Welling, 2018) with 500 pseudo inputs for all experiments. Our choices for the autoencoder architectures are a single stochastic layer Variational Autoencoder (VAE) with two hidden layers (300 units each), a two stochastic layer hierarchical Variational Autoencoder (HVAE)... In all cases the dimension of the latent vector is 40 for both the first and second stochastic layers. We use Adam (Kingma & Ba, 2015) as the optimization algorithm. (See the sketch below the table.)
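The Experiment Setup and Dataset Splits rows together describe a single-stochastic-layer VAE with two 300-unit hidden layers, a 40-dimensional latent vector, a standard Gaussian prior, Adam as the optimizer, and checkpointing on the best validation loss. The following is a minimal sketch of that configuration, assuming PyTorch and binarized 784-dimensional image inputs; the layer sizes, latent dimension, and optimizer follow the quoted setup, while the function and variable names, the Tanh activations, the learning rate, and the placeholder data loaders are illustrative assumptions rather than taken from the authors' code (which extends the VampPrior codebase of Tomczak & Welling, 2018). The VampPrior with 500 pseudo-inputs and the two-stochastic-layer HVAE are not shown.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Single-stochastic-layer VAE: two 300-unit hidden layers, 40-dim latent (per the quoted setup)."""
    def __init__(self, x_dim=784, h_dim=300, z_dim=40):
        super().__init__()
        # Encoder q(z|x): two hidden layers of 300 units each (activation choice is an assumption).
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, h_dim), nn.Tanh())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Decoder p(x|z): mirror architecture producing Bernoulli logits over pixels.
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def neg_elbo(x, logits, mu, logvar):
    """Negative ELBO: Bernoulli reconstruction NLL + analytic KL(q(z|x) || N(0, I)) for the standard Gaussian prior."""
    rec = F.binary_cross_entropy_with_logits(logits, x, reduction='none').sum(dim=1)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=1)
    return (rec + kl).mean()

# Placeholder loaders standing in for the 50K-train / 10K-validation binarized image splits (assumption).
train_loader = [(torch.bernoulli(torch.rand(100, 784)), None) for _ in range(10)]
val_loader = [(torch.bernoulli(torch.rand(100, 784)), None) for _ in range(2)]

model = VAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, as stated; learning rate is an assumption
best_val = float('inf')
for epoch in range(10):  # number of epochs is illustrative
    model.train()
    for x, _ in train_loader:
        optimizer.zero_grad()
        loss = neg_elbo(x, *model(x))
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        val = sum(neg_elbo(x, *model(x)).item() for x, _ in val_loader) / len(val_loader)
    if val < best_val:  # checkpoint the best model seen on the validation set, as in the quoted methodology
        best_val = val
        torch.save(model.state_dict(), 'best_vae.pt')

Swapping in the VampPrior or the hierarchical (HVAE) variant would change only the KL/prior term and the latent hierarchy; the publicly available VampPrior code that the paper says it extends is the natural starting point for reproducing those configurations.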