GILBO: One Metric to Measure Them All
Authors: Alexander A. Alemi, Ian Fischer
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compute the GILBO for 800 GANs and VAEs each trained on four datasets (MNIST, Fashion MNIST, CIFAR-10 and Celeb A) and discuss the results. |
| Researcher Affiliation | Industry | Alexander A. Alemi, Ian Fischer, Google AI, {alemi,iansf}@google.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | To that end, our implementation is available at https://github.com/google/compare_gan. |
| Open Datasets | Yes | We computed the GILBO for each of the 700 GANs and 100 VAEs tested in Lucic et al. (2017) on the MNIST, Fashion MNIST, CIFAR and Celeb A datasets in their wide range hyperparameter search. |
| Dataset Splits | No | The paper evaluates models taken from a prior study (Lucic et al., 2017), which would have used its own dataset splits, but it does not explicitly specify the training, validation, or test splits used for its own GILBO computations. |
| Hardware Specification | No | The paper mentions training times for estimating GILBO, implying computational resources were used, but does not provide specific hardware details (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper mentions using 'ADAM' for optimization but does not provide specific version numbers for any software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | For our encoder network, we duplicated the discriminator, but adjusted the final output to be a linear layer predicting the 64 × 2 = 128 parameters defining a (−1, 1) remapped Beta distribution (or Gaussian in the case of the VAE) over the latent space. We used a Beta since all of the GANs were trained with a (−1, 1) 64-dimensional uniform distribution. The parameters of the encoder were optimized for up to 500k steps with ADAM (Kingma & Ba, 2015) using a scheduled multiplicative learning rate decay. We used the same batch size (64) as in the original training. |
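
The Experiment Setup row above describes the encoder used to estimate the GILBO: a duplicate of the discriminator whose final linear layer outputs the 64 × 2 = 128 parameters of a (−1, 1)-remapped Beta distribution over the 64-dimensional latent space, optimized with ADAM at batch size 64 against a frozen generator. The sketch below illustrates that setup in PyTorch. It is a minimal illustration, not the authors' compare_gan implementation; the names `GILBOEncoder`, `gilbo_batch`, and `fit_gilbo`, the `trunk`/`feature_dim` arguments, the softplus parameterization, and the learning-rate and decay values are assumptions made for illustration.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Beta, Uniform

LATENT_DIM = 64  # the models in the study use a 64-dim Uniform(-1, 1) latent

class GILBOEncoder(nn.Module):
    """Assumed sketch: a discriminator-shaped trunk whose final linear layer
    predicts 64 * 2 = 128 Beta parameters, remapped to the (-1, 1) latent cube."""

    def __init__(self, trunk: nn.Module, feature_dim: int):
        super().__init__()
        self.trunk = trunk                               # duplicated discriminator body
        self.head = nn.Linear(feature_dim, 2 * LATENT_DIM)

    def forward(self, x) -> Beta:
        alpha, beta = F.softplus(self.head(self.trunk(x))).chunk(2, dim=-1)
        return Beta(alpha + 1e-6, beta + 1e-6)           # per-dimension Beta on (0, 1)

def gilbo_batch(encoder: GILBOEncoder, generator: nn.Module, batch_size: int = 64) -> torch.Tensor:
    """One Monte Carlo estimate of the GILBO:
    E_{z ~ p(z), x ~ g(z)} [ log e(z | x) - log p(z) ], with p(z) = Uniform(-1, 1)^64."""
    z = Uniform(-1.0, 1.0).sample((batch_size, LATENT_DIM))
    with torch.no_grad():
        x = generator(z)                                 # the generator stays frozen
    q = encoder(x)
    u = ((z + 1.0) / 2.0).clamp(1e-6, 1.0 - 1e-6)        # map latents onto the Beta support
    # change of variables: log q(z) on (-1, 1) = log Beta(u) - dim * log 2
    log_q = q.log_prob(u).sum(dim=-1) - LATENT_DIM * math.log(2.0)
    log_p = -LATENT_DIM * math.log(2.0)                  # uniform prior density on (-1, 1)^64
    return (log_q - log_p).mean()

def fit_gilbo(encoder: GILBOEncoder, generator: nn.Module, steps: int = 500_000) -> GILBOEncoder:
    """Maximize the bound w.r.t. the encoder only, using ADAM with a multiplicative
    learning-rate decay and the original batch size of 64. The 1e-4 learning rate
    and 0.999 decay factor are placeholder assumptions, not values from the paper."""
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
    sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.999)
    for _ in range(steps):
        loss = -gilbo_batch(encoder, generator)          # gradient ascent on the bound
        opt.zero_grad()
        loss.backward()
        opt.step()
        sched.step()
    return encoder
```

Under this parameterization the estimate reduces to the expected Beta log-likelihood of the remapped latents plus 64 log 2 nats. The authors' released code at https://github.com/google/compare_gan should be treated as the authoritative implementation.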