On Self Modulation for Generative Adversarial Networks

Authors: Ting Chen, Mario Lucic, Neil Houlsby, Sylvain Gelly

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In a large-scale empirical study we observe a relative decrease of 5%-35% in FID. We perform a large-scale study of self-modulation to demonstrate that this method yields robust improvements in a variety of settings.
Researcher Affiliation | Collaboration | Ting Chen, University of California, Los Angeles (tingchen@cs.ucla.edu); Mario Lucic, Neil Houlsby, Sylvain Gelly, Google Brain ({lucic,neilhoulsby,sylvaingelly}@google.com). Work done at Google.
Pseudocode | No | The paper describes the methodology using text and equations, but does not include any explicit pseudocode or algorithm blocks. (A hedged sketch of the self-modulation layer is given after this table.)
Open Source Code | Yes | Code at https://github.com/google/compare_gan
Open Datasets | Yes | We consider four datasets: CIFAR10, CELEBA-HQ, LSUN-BEDROOM, and IMAGENET. The LSUN-BEDROOM dataset (Yu et al., 2015) contains around 3M images. CELEBA-HQ contains 30k images (Karras et al., 2017). CIFAR10 contains 70K images (32x32x3)... Finally, we evaluate our method on IMAGENET, which contains 1.3M training images and 50K test images.
Dataset Splits | No | We partition the images randomly into a test set containing 30588 images and a train set containing the rest. We use 3000 examples as the test set and the remaining examples as the training set. CIFAR10 contains 70K images (32x32x3), partitioned into 60000 training instances and 10000 testing instances. Finally, we evaluate our method on IMAGENET, which contains 1.3M training images and 50K test images. The paper does not explicitly describe a validation dataset split.
Hardware Specification | Yes | All models are trained with batch size of 64 on a single nVidia P100 GPU.
Software Dependencies | No | The paper mentions the use of the Adam optimizer, the Inception classifier, and Inception Net, but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | We consider two loss functions. The first one is the non-saturating loss proposed in Goodfellow et al. (2014)... The second one is the hinge loss used in Miyato et al. (2018)... We consider two state-of-the-art techniques: gradient penalty (Gulrajani et al., 2017), and spectral normalization (Miyato et al., 2018). For the gradient penalty regularizer we consider regularization strength λ ∈ {1, 10}. We train all models for 100k generator steps with the Adam optimizer... We test two popular settings of the Adam hyperparameters (β1, β2): (0.5, 0.999) and (0, 0.9)... In total, this amounts to three different sets of hyperparameters for (β1, β2, disc iter): (0, 0.9, 1), (0, 0.9, 2), (0.5, 0.999, 1). We fix the learning rate to 0.0002... All models are trained with batch size of 64. (A sketch of these losses and optimizer settings is given after this table.)
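
Since the paper itself contains no pseudocode (see the Pseudocode row above), the following is a minimal sketch of a self-modulated batch-normalization layer as the paper describes it: the per-channel scale γ(z) and shift β(z) are produced by small one-hidden-layer MLPs applied to the generator's input z. This sketch is written in PyTorch rather than the authors' TensorFlow code in compare_gan, and the class name, hidden width, and exact parameterization are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class SelfModulatedBatchNorm(nn.Module):
        """Sketch of self-modulated batch norm: gamma and beta depend on z.

        Assumption: one-hidden-layer ReLU MLPs produce the per-channel
        scale/shift; the hidden width (32) is illustrative only.
        """
        def __init__(self, num_features, z_dim, hidden_dim=32):
            super().__init__()
            # Normalize activations without learned affine parameters;
            # the affine transform is generated from z instead.
            self.bn = nn.BatchNorm2d(num_features, affine=False)
            self.gamma_mlp = nn.Sequential(
                nn.Linear(z_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, num_features))
            self.beta_mlp = nn.Sequential(
                nn.Linear(z_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, num_features))

        def forward(self, h, z):
            # h: feature map of shape (B, C, H, W); z: latent of shape (B, z_dim)
            h = self.bn(h)
            gamma = self.gamma_mlp(z).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
            beta = self.beta_mlp(z).unsqueeze(-1).unsqueeze(-1)
            return gamma * h + beta

In use, each batch-norm layer of the generator would be replaced by such a module and given the noise vector z alongside the hidden activations; details such as initialization are left open here.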
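
As a companion to the Experiment Setup row, here is a hedged sketch of the two GAN losses and the Adam settings quoted above. Only the loss formulas (non-saturating loss of Goodfellow et al., 2014; hinge loss of Miyato et al., 2018) and the quoted hyperparameters (learning rate 0.0002, batch size 64, (β1, β2) ∈ {(0.5, 0.999), (0, 0.9)}) come from the paper excerpt; the placeholder modules and function names are assumptions for illustration.

    import torch
    import torch.nn.functional as F

    def non_saturating_losses(d_real, d_fake):
        """Non-saturating GAN loss (Goodfellow et al., 2014) on raw logits."""
        d_loss = F.softplus(-d_real).mean() + F.softplus(d_fake).mean()
        g_loss = F.softplus(-d_fake).mean()
        return d_loss, g_loss

    def hinge_losses(d_real, d_fake):
        """Hinge loss (Miyato et al., 2018) on raw logits."""
        d_loss = F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()
        g_loss = -d_fake.mean()
        return d_loss, g_loss

    # Placeholder modules (not the paper's architectures) to show the quoted
    # optimizer settings: lr = 0.0002, betas in {(0.5, 0.999), (0, 0.9)}.
    generator = torch.nn.Linear(128, 3 * 32 * 32)
    discriminator = torch.nn.Linear(3 * 32 * 32, 1)
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.0, 0.9))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.0, 0.9))

    # Usage example with dummy logits for a batch of 64 (the quoted batch size):
    d_real, d_fake = torch.randn(64), torch.randn(64)
    print(non_saturating_losses(d_real, d_fake), hinge_losses(d_real, d_fake))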