GILBO: One Metric to Measure Them All

Authors: Alexander A. Alemi, Ian Fischer

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compute the GILBO for 800 GANs and VAEs each trained on four datasets (MNIST, Fashion MNIST, CIFAR-10 and CelebA) and discuss the results.
Researcher Affiliation | Industry | Alexander A. Alemi, Ian Fischer, Google AI, {alemi,iansf}@google.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | To that end, our implementation is available at https://github.com/google/compare_gan.
Open Datasets | Yes | We computed the GILBO for each of the 700 GANs and 100 VAEs tested in Lucic et al. (2017) on the MNIST, Fashion MNIST, CIFAR and CelebA datasets in their wide range hyperparameter search.
Dataset Splits | No | The paper builds on models from a prior study (Lucic et al., 2017) that would have used dataset splits, but it does not explicitly specify the training, validation, or test splits used for its own GILBO computations.
Hardware Specification | No | The paper reports training times for estimating the GILBO, implying substantial computational resources, but gives no specific hardware details (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions using ADAM for optimization but does not provide version numbers for any software libraries or dependencies used in the experiments.
Experiment Setup | Yes | For our encoder network, we duplicated the discriminator, but adjusted the final output to be a linear layer predicting the 64 × 2 = 128 parameters defining a (−1, 1) remapped Beta distribution (or Gaussian in the case of the VAE) over the latent space. We used a Beta since all of the GANs were trained with a (−1, 1) 64-dimensional uniform distribution. The parameters of the encoder were optimized for up to 500k steps with ADAM (Kingma & Ba, 2015) using a scheduled multiplicative learning rate decay. We used the same batch size (64) as in the original training.
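The quoted setup amounts to maximizing GILBO = E_{z~p(z), x~G(z)}[log e(z|x) - log p(z)] over the parameters of the encoder e(z|x) while the trained generator G stays fixed. The following is a minimal PyTorch sketch of that procedure, not the authors' TensorFlow compare_gan implementation; the names generator, make_trunk, and feat_dim are hypothetical stand-ins for the frozen pretrained generator, a factory that rebuilds the discriminator body, and the width of that body's output features.

# Hedged sketch of a GILBO estimator, assuming a 64-dimensional Uniform(-1, 1)
# prior and a discriminator-shaped trunk; hypothetical names: make_trunk,
# feat_dim, generator.
import math
import torch
import torch.nn.functional as F
from torch.distributions import Beta, Independent, TransformedDistribution
from torch.distributions.transforms import AffineTransform

LATENT_DIM = 64
BATCH = 64  # same batch size (64) as in the original GAN/VAE training

class GilboEncoder(torch.nn.Module):
    def __init__(self, make_trunk, feat_dim):
        super().__init__()
        self.trunk = make_trunk()  # hypothetical copy of the discriminator body
        self.head = torch.nn.Linear(feat_dim, 2 * LATENT_DIM)  # 64 x 2 = 128 Beta parameters

    def forward(self, x):
        a, b = self.head(self.trunk(x)).chunk(2, dim=-1)
        base = Beta(F.softplus(a) + 1e-3, F.softplus(b) + 1e-3)  # Beta on (0, 1)
        remapped = TransformedDistribution(base, AffineTransform(loc=-1.0, scale=2.0))  # now on (-1, 1)
        return Independent(remapped, 1)  # treat the 64 latent dims as one event

def gilbo_step(encoder, generator, optimizer):
    z = torch.rand(BATCH, LATENT_DIM) * 2.0 - 1.0  # z ~ Uniform(-1, 1)^64, the GAN prior
    with torch.no_grad():
        x = generator(z)  # the trained generator is held fixed
    log_e = encoder(x).log_prob(z)  # log e(z | x) under the amortized encoder
    loss = -log_e.mean()  # maximizing E[log e(z|x)] tightens the lower bound
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # GILBO = E[log e(z|x)] - log p(z); log p(z) = -64 * log 2 for this uniform prior.
    return (log_e.detach().mean() + LATENT_DIM * math.log(2.0)).item()

In this sketch the returned value is the per-batch GILBO estimate in nats; the encoder parameters would be optimized with torch.optim.Adam, optionally under a multiplicative learning-rate schedule (e.g., torch.optim.lr_scheduler.MultiplicativeLR), for up to 500k such steps, and the converged average provides the reported lower bound on the mutual information I(X; Z).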