Refining Deep Generative Models via Discriminator Gradient Flow

Authors: Abdul Fatir Ansari, Ming Liang Ang, Harold Soh

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods.
Researcher Affiliation | Academia | Abdul Fatir Ansari, Ming Liang Ang & Harold Soh, Department of Computer Science, School of Computing, National University of Singapore. {abdulfatir, angmingliang}@u.nus.edu, harold@comp.nus.edu.sg
Pseudocode | Yes | Algorithm 1 Refinement in the Latent Space using DGflow. Require: first derivative of f (f′), generator (g_θ), discriminator (d_φ), number of update steps (N), step size (η), noise factor (γ). 1: z_0 ∼ p_Z(z) ▷ Sample from the prior. 2: for i = 0, …, N − 1 do 3: ξ_i ∼ N(0, I) 4: z_{i+1} = z_i − η ∇_{z_i} f′(e^{−d_φ(g_θ(z_i))}) + √(2ηγ) ξ_i 5: end for 6: return g_θ(z_N) ▷ The refined sample. (A runnable sketch of this loop follows the table.)
Open Source Code | Yes | Our code is available online at https://github.com/clear-nus/DGflow.
Open Datasets | Yes | Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods. ... CIFAR10 (Krizhevsky et al., 2009) is a dataset of 60K natural RGB images of size 32×32 from 10 classes. STL10 (Coates et al., 2011) is a dataset of 100K natural RGB images of size 96×96 from 10 classes. We resized the STL10 dataset to 48×48 for SNGAN and WGAN-GP, and to 32×32 for MMD-GAN, OCFGAN-GP, and VAE since the respective base models were trained on these sizes. ... We used the Billion Words dataset (Chelba et al., 2013), which was pre-processed into 32-character-long strings. (The resizing step is sketched in code after the table.)
Dataset Splits | No | The 25 Gaussians dataset was constructed by generating 100,000 samples from a mixture of 25 equally likely 2D isotropic Gaussians with means {−4, −2, 0, 2, 4} × {−4, −2, 0, 2, 4} ⊂ R² and standard deviation 0.05. ... The 2D Swissroll dataset was constructed by first generating 100,000 samples... The models were trained for 10K generator iterations with a batch size of 256... We used the entire training and test set (60K images) for CIFAR10 and the entire unlabeled set (100K images) for STL10 as the set of real images used to compute FID. (The 25 Gaussians construction is sketched in code after the table.)
Hardware Specification | Yes | Runtime of DGflow (KL) for models that do not require density-ratio correction on a single GeForce RTX 2080 Ti GPU.
Software Dependencies | No | KDE was performed using sklearn.neighbors.KernelDensity with a Gaussian kernel and a kernel bandwidth of 0.1. (See the KDE sketch after the table.)
Experiment Setup | Yes | We trained a WGAN-GP model on both datasets. The generator was a fully-connected network with ReLU non-linearities that mapped z ∼ N(0, I_{2×2}) to x ∈ R². Similarly, the discriminator was a fully-connected network with ReLU non-linearities that mapped x ∈ R² to R. We refer the reader to Gulrajani et al. (2017) for the exact network structures. The gradient penalty factor was set to 10. The models were trained for 10K generator iterations with a batch size of 256 using the Adam optimizer with a learning rate of 10⁻⁴, β₁ = 0.5, and β₂ = 0.9. We updated the discriminator 5 times for each generator iteration. (This setup is sketched in code after the table.)
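
The pseudocode row above translates almost line-for-line into PyTorch. Below is a minimal sketch of Algorithm 1, assuming a trained generator g and a discriminator d whose output approximates log(p_data/p_g), with the KL choice f(u) = u log u; the function and argument names are ours, not the paper's.

    import math
    import torch

    def f_prime_kl(u):
        # First derivative of f(u) = u * log(u), used for the KL gradient flow.
        return torch.log(u) + 1.0

    def dgflow_refine(g, d, f_prime, z_dim, n_steps=25, eta=0.1, gamma=0.01, batch=64):
        z = torch.randn(batch, z_dim)  # z_0 ~ p_Z(z), a standard Gaussian prior
        for _ in range(n_steps):
            z = z.detach().requires_grad_(True)
            # Density-ratio estimate r(z) = exp(-d(g(z))) ~ p_g / p_data
            # (assumes d's logit approximates log(p_data / p_g))
            ratio = torch.exp(-d(g(z)))
            grad = torch.autograd.grad(f_prime(ratio).sum(), z)[0]
            noise = torch.randn_like(z)  # xi_i ~ N(0, I)
            z = z - eta * grad + math.sqrt(2.0 * eta * gamma) * noise
        return g(z.detach())  # the refined samples g(z_N)

For the KL case the update simplifies to z + η ∇_z d_φ(g_θ(z)) plus noise, which is why a single discriminator forward/backward pass per step suffices.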
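The Open Datasets row mentions resizing STL10 to the sizes the base models were trained on. A hedged torchvision sketch of that preprocessing (the dataset root and split are assumptions):

    from torchvision import datasets, transforms

    resize_48 = transforms.Compose([
        transforms.Resize((48, 48)),  # 48x48 for SNGAN and WGAN-GP; use (32, 32) for MMD-GAN, OCFGAN-GP, and VAE
        transforms.ToTensor(),
    ])
    stl10 = datasets.STL10(root="./data", split="unlabeled", download=True,
                           transform=resize_48)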
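The 25 Gaussians construction quoted in the Dataset Splits row is straightforward to reproduce; a minimal NumPy sketch (the random seed is an assumption):

    import numpy as np

    rng = np.random.default_rng(0)
    grid = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
    means = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)  # 25 x 2 grid of means
    idx = rng.integers(0, 25, size=100_000)  # equally likely mixture component per sample
    data = means[idx] + 0.05 * rng.standard_normal((100_000, 2))  # isotropic std 0.05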
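The KDE setting in the Software Dependencies row corresponds to the following scikit-learn call; the input array here is a placeholder for the refined 2D samples:

    import numpy as np
    from sklearn.neighbors import KernelDensity

    samples = np.random.randn(1000, 2)  # placeholder for refined 2D samples
    kde = KernelDensity(kernel="gaussian", bandwidth=0.1).fit(samples)
    log_density = kde.score_samples(samples)  # log-density estimate at each point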
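Finally, the 2D WGAN-GP setup in the Experiment Setup row can be sketched as below. The hidden width and depth are assumptions; the paper defers the exact architectures to Gulrajani et al. (2017).

    import torch
    import torch.nn as nn

    def mlp(in_dim, out_dim, hidden=512):
        # Fully-connected network with ReLU non-linearities.
        return nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    generator = mlp(2, 2)      # maps z ~ N(0, I_{2x2}) to x in R^2
    discriminator = mlp(2, 1)  # maps x in R^2 to a real-valued score

    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.9))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))
    # Training loop (not shown): 10K generator iterations, batch size 256,
    # 5 discriminator updates per generator step, gradient penalty factor 10.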