Improved Training of Wasserstein GANs

Authors: Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron C. Courville

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models with continuous generators. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms. (Section 5: Experiments)
Researcher Affiliation | Academia | Ishaan Gulrajani (1), Faruk Ahmed (1), Martin Arjovsky (2), Vincent Dumoulin (1), Aaron Courville (1,3); (1) Montreal Institute for Learning Algorithms, (2) Courant Institute of Mathematical Sciences, (3) CIFAR Fellow. Emails: igul222@gmail.com, {faruk.ahmed,vincent.dumoulin,aaron.courville}@umontreal.ca, ma4371@nyu.edu
Pseudocode | Yes | Algorithm 1: WGAN with gradient penalty (a code sketch of the penalty term follows this table).
Open Source Code | Yes | Code for our models is available at https://github.com/igul222/improved_wgan_training.
Open Datasets | Yes | From this set, we sample 200 architectures and train each on 32×32 ImageNet with both WGAN-GP and the standard GAN objectives. ... train six different GAN architectures on the LSUN bedrooms dataset [30]. ... train WGANs with weight clipping and our gradient penalty on CIFAR-10 [13] ... we train a character-level GAN language model on the Google Billion Word dataset [6].
Dataset Splits | Yes | To explore the loss curve's behavior when the network overfits, we train large unregularized WGANs on a random 1000-image subset of MNIST and plot the negative critic loss on both the training and validation sets in Figure 5b.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions specific optimizers (Adam, RMSProp) and normalization schemes (Layer Normalization), but it does not provide version numbers for any software, libraries, or frameworks used.
Experiment Setup | Yes | Algorithm 1 (WGAN with gradient penalty): We use default values of λ = 10, n_critic = 5, α = 0.0001, β1 = 0, β2 = 0.9. Table 1 (We evaluate WGAN-GP's ability to train the architectures in this set): Nonlinearity (G): [ReLU, LeakyReLU, softplus(2x+2)/2 − 1, tanh]; Nonlinearity (D): [ReLU, LeakyReLU, softplus(2x+2)/2 − 1, tanh]; Depth (G): [4, 8, 12, 20]; Depth (D): [4, 8, 12, 20]; Batch norm (G): [True, False]; Batch norm (D; layer norm for WGAN-GP): [True, False]; Base filter count (G): [32, 64, 128]; Base filter count (D): [32, 64, 128]. (A training-loop sketch with these defaults, and a sampler over this grid, follow the table.)
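
The gradient penalty quoted above (and in Algorithm 1 of the paper) adds λ times the squared deviation of the critic's input-gradient norm from 1, evaluated at points interpolated between real and generated samples. The authors' released code is TensorFlow; the following is only a minimal PyTorch-style sketch of that term, with `critic`, `real`, and `fake` as placeholder names:

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """Two-sided penalty from Algorithm 1: lambda * (||grad_xhat D(xhat)||_2 - 1)^2,
    averaged over random interpolates xhat between real and generated samples."""
    batch_size = real.size(0)
    # epsilon ~ U[0, 1], one value per example, broadcast over non-batch dims.
    eps = torch.rand(batch_size, *([1] * (real.dim() - 1)), device=real.device)
    x_hat = eps * real.detach() + (1.0 - eps) * fake.detach()
    x_hat.requires_grad_(True)
    d_hat = critic(x_hat)
    # Gradient of the critic output with respect to the interpolated inputs.
    grads = torch.autograd.grad(
        outputs=d_hat, inputs=x_hat,
        grad_outputs=torch.ones_like(d_hat),
        create_graph=True)[0]
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```

`create_graph=True` keeps the penalty differentiable with respect to the critic's parameters, which is what lets it act as a regularizer during the critic update.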
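Algorithm 1's outer loop alternates n_critic critic updates with one generator update, using Adam with the defaults quoted in the Experiment Setup row (λ = 10, n_critic = 5, α = 0.0001, β1 = 0, β2 = 0.9). A hedged sketch of one such iteration, assuming the `gradient_penalty` helper above and placeholder `generator`, `critic`, and `sample_real_batch` objects (not the authors' code):

```python
import torch

# Defaults reported in the paper: lambda = 10, n_critic = 5, Adam(alpha=1e-4, beta1=0, beta2=0.9).
LAMBDA, N_CRITIC, LR, BETAS = 10.0, 5, 1e-4, (0.0, 0.9)

def make_optimizers(generator, critic):
    opt_g = torch.optim.Adam(generator.parameters(), lr=LR, betas=BETAS)
    opt_d = torch.optim.Adam(critic.parameters(), lr=LR, betas=BETAS)
    return opt_g, opt_d

def train_step(generator, critic, opt_g, opt_d, sample_real_batch,
               z_dim, batch_size, device):
    # Critic: n_critic updates per generator update (Algorithm 1).
    for _ in range(N_CRITIC):
        real = sample_real_batch(batch_size).to(device)
        z = torch.randn(batch_size, z_dim, device=device)
        fake = generator(z).detach()
        d_loss = (critic(fake).mean() - critic(real).mean()
                  + gradient_penalty(critic, real, fake, LAMBDA))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

    # Generator: minimize -D(G(z)).
    z = torch.randn(batch_size, z_dim, device=device)
    g_loss = -critic(generator(z)).mean()
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```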
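The architecture-robustness experiment samples 200 configurations from the Table 1 grid quoted in the Experiment Setup row. A small illustrative sampler over that grid (names such as `ARCH_SPACE` and `sample_architecture` are not from the paper's code):

```python
import random

# Search space quoted from Table 1 of the paper; "shifted_softplus" stands in
# for the nonlinearity softplus(2x + 2)/2 - 1.
ARCH_SPACE = {
    "nonlinearity_g": ["relu", "leaky_relu", "shifted_softplus", "tanh"],
    "nonlinearity_d": ["relu", "leaky_relu", "shifted_softplus", "tanh"],
    "depth_g": [4, 8, 12, 20],
    "depth_d": [4, 8, 12, 20],
    "batch_norm_g": [True, False],
    "batch_norm_d": [True, False],  # layer norm is used instead for WGAN-GP
    "base_filters_g": [32, 64, 128],
    "base_filters_d": [32, 64, 128],
}

def sample_architecture(rng=random):
    """Draw one architecture configuration uniformly from the grid."""
    return {name: rng.choice(choices) for name, choices in ARCH_SPACE.items()}

# e.g. the 200 architectures trained with both WGAN-GP and the standard GAN objective
configs = [sample_architecture() for _ in range(200)]
```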