A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

Authors: Hugo Berard, Gauthier Gidel, Amjad Almahairi, Pascal Vincent, Simon Lacoste-Julien

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To answer this question we conducted extensive experiments by training different GAN formulations (NSGAN and WGAN-GP) with different optimizers (Adam and ExtraAdam) on three datasets (MoG, MNIST and CIFAR10). Based on our experiments and using our visualization techniques we observe that the landscape of GANs is fundamentally different from the standard loss surfaces of deep networks. Furthermore, we provide evidence that existing GAN training methods do not converge to a local Nash equilibrium. (A sketch of the extrapolation idea behind ExtraAdam follows the table.)
Researcher Affiliation | Collaboration | Hugo Berard (Mila, Université de Montréal; Facebook AI Research), Gauthier Gidel (Mila, Université de Montréal; Element AI), Amjad Almahairi (Element AI), Pascal Vincent (Mila, Université de Montréal; Facebook AI Research), Simon Lacoste-Julien (Mila, Université de Montréal; Element AI)
Pseudocode | No | The paper describes mathematical formulations and processes but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://bit.ly/2kwTu87
Open Datasets | Yes | Datasets. We first propose to train a GAN on a toy task composed of a 1D mixture of 2 Gaussians (MoG) with 10,000 samples... We also train a GAN on MNIST... Finally we also look at the optimization landscape of a state of the art ResNet on CIFAR10 (Krizhevsky and Hinton, 2009). We use the training part of MNIST dataset LeCun et al. (2010) (50K examples) for training our models, and scale each image to the range [-1, 1]. (A data-preparation sketch follows the table.)
Dataset Splits | No | The paper mentions using the 'training part of MNIST dataset (50K examples)' but does not specify any explicit training, validation, or test splits with percentages or sample counts for reproduction.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using the 'Scipy library' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | Hyperparameters for WGAN-GP on MoG: Batch size = 10,000 (full-batch); Number of iterations = 30,000; Learning rate for generator = 1 × 10^-2; Learning rate for discriminator = 1 × 10^-1; Gradient penalty coefficient = 1 × 10^-3. (A training-step sketch using these values follows the table.)
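
The 'Open Datasets' row describes a 1D mixture of two Gaussians with 10,000 samples and MNIST images rescaled to [-1, 1]. A minimal NumPy/torchvision sketch of that data preparation is below; the mixture's means and standard deviations are not given in this summary, so the values used here are placeholders.

    import numpy as np
    from torchvision import datasets, transforms

    def sample_mog_1d(n_samples=10_000, means=(-2.0, 2.0), stds=(0.5, 0.5), seed=0):
        """Sample a 1D mixture of two equally weighted Gaussians.

        The means/stds are illustrative placeholders; the paper's exact
        mixture parameters are not stated in this summary.
        """
        rng = np.random.default_rng(seed)
        component = rng.integers(0, 2, size=n_samples)         # pick a component per sample
        return rng.normal(loc=np.take(means, component),
                          scale=np.take(stds, component)).astype(np.float32)

    # MNIST, rescaled from [0, 1] to [-1, 1] as in the quoted setup.
    # (The quoted setup trains on 50K examples; how the remaining images
    # are held out is not specified in this summary.)
    mnist_train = datasets.MNIST(
        root="./data", train=True, download=True,
        transform=transforms.Compose([
            transforms.ToTensor(),                          # -> [0, 1]
            transforms.Normalize(mean=(0.5,), std=(0.5,)),  # -> [-1, 1]
        ]),
    )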
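
The 'Experiment Setup' row gives full-batch WGAN-GP hyperparameters for the MoG task. The sketch below wires those numbers into one PyTorch training step; the network architectures, the latent dimension, and the 1:1 ratio of discriminator to generator updates are assumptions rather than values taken from the paper.

    import torch
    import torch.nn as nn

    # Hyperparameters quoted in the table (WGAN-GP on the 1D MoG task).
    BATCH_SIZE   = 10_000     # full batch
    N_ITERATIONS = 30_000
    LR_GEN       = 1e-2
    LR_DISC      = 1e-1
    GP_COEFF     = 1e-3
    LATENT_DIM   = 64         # assumption: not given in the table

    # Assumption: small MLPs; the paper's exact architectures are not quoted here.
    gen  = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
    disc = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))

    opt_gen  = torch.optim.Adam(gen.parameters(),  lr=LR_GEN)
    opt_disc = torch.optim.Adam(disc.parameters(), lr=LR_DISC)

    def gradient_penalty(real, fake):
        """WGAN-GP term: GP_COEFF * E[(||grad_x D(x_hat)|| - 1)^2] on interpolates."""
        alpha  = torch.rand(real.size(0), 1)
        interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
        grads  = torch.autograd.grad(disc(interp).sum(), interp, create_graph=True)[0]
        return GP_COEFF * ((grads.norm(2, dim=1) - 1) ** 2).mean()

    def training_step(real):                      # real: (BATCH_SIZE, 1) MoG samples
        # Discriminator (critic) update.
        fake   = gen(torch.randn(real.size(0), LATENT_DIM)).detach()
        loss_d = disc(fake).mean() - disc(real).mean() + gradient_penalty(real, fake)
        opt_disc.zero_grad(); loss_d.backward(); opt_disc.step()

        # Generator update.
        loss_g = -disc(gen(torch.randn(real.size(0), LATENT_DIM))).mean()
        opt_gen.zero_grad(); loss_g.backward(); opt_gen.step()
        return loss_d.item(), loss_g.item()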
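
The 'Research Type' row lists ExtraAdam among the optimizers. ExtraAdam combines Adam updates with an extragradient-style extrapolation step; the sketch below illustrates only that extrapolation idea, using plain gradient steps on a single objective (in a GAN both players are extrapolated simultaneously, and the actual optimizer also keeps Adam's moment estimates).

    import torch

    def extragradient_step(params, loss_fn, lr=1e-2):
        """One simplified extragradient update on a list of leaf tensors.

        1) Extrapolate: take a lookahead gradient step from the current point.
        2) Update: apply, at the original point, the gradient evaluated at the
           lookahead point.
        """
        grads = torch.autograd.grad(loss_fn(params), params)
        lookahead = [(p - lr * g).detach().requires_grad_(True)
                     for p, g in zip(params, grads)]
        la_grads = torch.autograd.grad(loss_fn(lookahead), lookahead)
        with torch.no_grad():
            for p, g in zip(params, la_grads):
                p -= lr * g

    # Toy usage on a single quadratic objective (hypothetical, for illustration only).
    w = torch.randn(2, requires_grad=True)
    extragradient_step([w], lambda ps: (ps[0] ** 2).sum(), lr=0.1)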