Large Scale GAN Training for High Fidelity Natural Image Synthesis

Authors: Andrew Brock, Jeff Donahue, Karen Simonyan

ICLR 2019
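The report below quotes Inception Score (IS) and Fréchet Inception Distance (FID) figures. As background, FID fits a Gaussian to the Inception-feature statistics of real and of generated images and computes the closed-form Fréchet distance between them. The following is a minimal NumPy sketch of that formula only, under the assumption that feature extraction has already been done; `frechet_distance` is our own illustrative name, not code from the paper:

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2).

    FID applies this to the mean/covariance of Inception features of real
    vs. generated images; the feature extraction itself is omitted here.
    Uses the identity Tr((S1 S2)^{1/2}) = sum of sqrt of eigenvalues of
    S1 @ S2, which avoids an explicit matrix square root.
    """
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Eigenvalues of a product of PSD matrices are real and non-negative;
    # clip tiny negative values caused by floating-point error.
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    return diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt

# Toy usage: statistics fitted to two random "feature" sets.
rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 8))
fake = rng.normal(loc=0.5, size=(1000, 8))
stats = lambda x: (x.mean(axis=0), np.cov(x, rowvar=False))
print(float(frechet_distance(*stats(real), *stats(fake))))
```

Identical distributions give a distance of zero; the headline FID of 7.4 in the table below is this quantity computed on Inception activations.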

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "When trained on ImageNet at 128×128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Fréchet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.65." |
| Researcher Affiliation | Collaboration | Andrew Brock (Heriot-Watt University, ajb5@hw.ac.uk); Jeff Donahue (DeepMind, jeffdonahue@google.com); Karen Simonyan (DeepMind, simonyan@google.com) |
| Pseudocode | No | The paper provides architectural diagrams and tables describing network layers, but no pseudocode or algorithm blocks are present. |
| Open Source Code | Yes | "Code and weights for our pretrained generators are publicly available": https://tfhub.dev/s?q=biggan |
| Open Datasets | Yes | "We evaluate our models on ImageNet ILSVRC 2012 (Russakovsky et al., 2015) at 128×128, 256×256, and 512×512 resolutions, employing the settings from Table 1, row 8. [...] To confirm that our design choices are effective for even larger and more complex and diverse datasets, we also present results of our system on a subset of JFT-300M (Sun et al., 2017)." |
| Dataset Splits | Yes | "We compute the IS for both the training and validation sets of ImageNet. At 128×128 the training data has an IS of 233, and the validation data has an IS of 166. At 256×256 the training data has an IS of 377, and the validation data has an IS of 234. At 512×512 the training data has an IS of 348, and the validation data has an IS of 241." |
| Hardware Specification | Yes | "Each model is trained on 128 to 512 cores of a Google TPUv3 Pod (Google, 2018)." |
| Software Dependencies | No | The paper states the models are "implemented in TensorFlow (Abadi et al., 2016)" but does not provide version numbers for TensorFlow or any other key software libraries used. |
| Experiment Setup | Yes | "For BigGAN models at all resolutions, we use 2×10⁻⁴ in D and 5×10⁻⁵ in G. For BigGAN-deep, we use the learning rate of 2×10⁻⁴ in D and 5×10⁻⁵ in G for 128×128 models, and 2.5×10⁻⁵ in both D and G for 256×256 and 512×512 models. [...] We experimented with the number of D steps per G step (varying it from 1 to 6) and found that two D steps per G step gave the best results." |
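The Experiment Setup row quotes a specific optimization schedule: separate learning rates for the discriminator (2×10⁻⁴) and generator (5×10⁻⁵), with two D steps per G step. A minimal sketch of that alternation follows; this is our own schematic, not the authors' training code, and `update_discriminator`/`update_generator` are hypothetical placeholders for real forward/backward passes:

```python
# Schematic of the reported BigGAN schedule: lr 2e-4 for D, 5e-5 for G,
# two discriminator updates per generator update.
D_LR, G_LR = 2e-4, 5e-5
D_STEPS_PER_G_STEP = 2

def update_discriminator(lr):
    # Placeholder: a real implementation would take an optimizer step on D.
    return lr

def update_generator(lr):
    # Placeholder: a real implementation would take an optimizer step on G.
    return lr

def train(num_g_steps):
    d_updates = g_updates = 0
    for _ in range(num_g_steps):
        for _ in range(D_STEPS_PER_G_STEP):
            update_discriminator(D_LR)
            d_updates += 1
        update_generator(G_LR)
        g_updates += 1
    return d_updates, g_updates

print(train(1000))  # (2000, 1000): 2 D updates per G update
```

The paper reports trying 1 to 6 D steps per G step; changing `D_STEPS_PER_G_STEP` is the only knob this sketch exposes for that sweep.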