Generative Models of Visually Grounded Imagination

Authors: Ramakrishna Vedantam, Ian Fischer, Jonathan Huang, Kevin Murphy

ICLR 2018

Each entry below gives a reproducibility variable, the assessed result, and the supporting LLM response.

Research Type: Experimental
LLM Response: "Finally, we perform a detailed comparison of our method with two existing joint image-attribute VAE methods (the JMVAE method of Suzuki et al. (2017) and the BiVCCA method of Wang et al. (2016b)) by applying them to two datasets: the MNIST-with-attributes dataset (which we introduce here), and the CelebA dataset (Liu et al., 2015). Section 5 reports experimental results on two different datasets."

Researcher Affiliation: Collaboration
LLM Response: Ramakrishna Vedantam (Georgia Tech, vrama@gatech.edu); Ian Fischer (Google Inc., iansf@google.com); Jonathan Huang (Google Inc., jonathanhuang@google.com); Kevin Murphy (Google Inc., kpmurphy@google.com)

Pseudocode: No
LLM Response: The paper does not contain any explicitly labeled pseudocode or algorithm blocks.

Open Source Code: No
LLM Response: The paper does not provide an explicit statement about, or link to, open-source code for the described methodology.

Open Datasets: Yes
LLM Response: "Finally, we perform a detailed comparison of our method with two existing joint image-attribute VAE methods... by applying them to two datasets: the MNIST-with-attributes dataset (which we introduce here), and the CelebA dataset (Liu et al., 2015)."

Dataset Splits: Yes
LLM Response: "We split the images into a train, val and test set of 85%, 5%, and 10% of the data respectively to create the IID split." "We choose the hyperparameters for each method so as to maximize JS-overall, which is an overall measure of correctness and coverage (see Section 3) on a validation set of attribute queries."

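The quoted split is a simple random 85% / 5% / 10% partition. The following is a minimal sketch of such an IID split; the dummy array, variable names, and seed are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical stand-in for the image set; the real data loading
# pipeline is not described in the paper.
images = np.zeros((70_000, 28, 28))

rng = np.random.default_rng(0)            # seed is an assumption
perm = rng.permutation(len(images))
n_train = int(0.85 * len(images))         # 85% train
n_val = int(0.05 * len(images))           # 5% val
train_idx = perm[:n_train]
val_idx = perm[n_train:n_train + n_val]
test_idx = perm[n_train + n_val:]         # remaining 10% test
```
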
Hardware Specification: Yes
LLM Response: "Our models typically take around a day to train on NVIDIA Titan X GPUs."

Software Dependencies: No
LLM Response: The paper mentions components such as Adam and the DCGAN architecture but does not provide specific version numbers for these or for other ancillary software.

Experiment Setup: Yes
LLM Response: "We use Adam (Kingma & Ba, 2015) for optimization, with a learning rate of 0.0001, and a minibatch size of 64. We train all models for 250,000 steps (we generally found that the models do not tend to overfit in our experiments). We use d = 10 latent dimensions for all models."
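
The reported settings can be collected into a short training-loop sketch. Everything below other than the four hyperparameter values is a hypothetical stand-in: the model, data, and loss are placeholders, not the paper's joint image-attribute VAE or its objective.

```python
import torch
from torch import nn

# Hyperparameter values as reported in the paper.
LATENT_DIM = 10        # d = 10 latent dimensions
BATCH_SIZE = 64        # minibatch size of 64
LEARNING_RATE = 1e-4   # Adam learning rate of 0.0001
NUM_STEPS = 250_000    # 250,000 training steps

# Placeholder model; the paper's architecture is not reproduced here.
model = nn.Linear(784, LATENT_DIM)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

for step in range(NUM_STEPS):
    x = torch.randn(BATCH_SIZE, 784)   # dummy batch standing in for real data
    loss = model(x).pow(2).mean()      # placeholder loss, not the VAE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```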