Refining Deep Generative Models via Discriminator Gradient Flow
Authors: Abdul Fatir Ansari, Ming Liang Ang, Harold Soh
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods. |
| Researcher Affiliation | Academia | Abdul Fatir Ansari, Ming Liang Ang & Harold Soh, Department of Computer Science, School of Computing, National University of Singapore. {abdulfatir, angmingliang}@u.nus.edu, harold@comp.nus.edu.sg |
| Pseudocode | Yes | Algorithm 1: Refinement in the Latent Space using DGflow. Require: first derivative of f (f′), generator (g_θ), discriminator (d_φ), number of update steps (N), step size (η), noise factor (γ). 1: z_0 ∼ p_Z(z) ▷ Sample from the prior. 2: for i = 0, …, N − 1 do 3: ξ_i ∼ N(0, I) 4: z_{i+1} = z_i − η ∇_{z_i} f′(e^{−d_φ(g_θ(z_i))}) + √(2ηγ) ξ_i 5: end for 6: return g_θ(z_N) ▷ The refined sample. (A PyTorch sketch of this loop follows the table.) |
| Open Source Code | Yes | Our code is available online at https://github.com/clear-nus/DGflow. |
| Open Datasets | Yes | Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods. ... CIFAR10 (Krizhevsky et al., 2009) is a dataset of 60K natural RGB images of size 32×32 from 10 classes. STL10 is a dataset of 100K natural RGB images of size 96×96 from 10 classes. We resized the STL10 (Coates et al., 2011) dataset to 48×48 for SNGAN and WGAN-GP, and to 32×32 for MMD-GAN, OCFGAN-GP, and VAE since the respective base models were trained on these sizes. ... We used the Billion Words dataset (Chelba et al., 2013) which was pre-processed into 32-character long strings. |
| Dataset Splits | No | The 25 Gaussians dataset was constructed by generating 100,000 samples from a mixture of 25 equally likely 2D isotropic Gaussians with means {−4, −2, 0, 2, 4} × {−4, −2, 0, 2, 4} ⊂ R² and standard deviation 0.05. ... The 2D Swissroll dataset was constructed by first generating 100,000 samples... The models were trained for 10K generator iterations with a batch size of 256... We used the entire training and test set (60K images) for CIFAR10 and the entire unlabeled set (100K images) for STL10 as the set of real images used to compute FID. (A NumPy sketch of the 25 Gaussians construction follows the table.) |
| Hardware Specification | Yes | Runtime of DGflow(KL) for models that do not require density-ratio correction on a single GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | KDE was performed using sklearn.neighbors.KernelDensity with a Gaussian kernel and a kernel bandwidth of 0.1. (See the scikit-learn sketch after the table.) |
| Experiment Setup | Yes | We trained a WGAN-GP model for both datasets. The generator was a fully-connected network with ReLU non-linearities that mapped z ∼ N(0, I_{2×2}) to x ∈ R². Similarly, the discriminator was a fully-connected network with ReLU non-linearities that mapped x ∈ R² to R. We refer the reader to Gulrajani et al. (2017) for the exact network structures. The gradient penalty factor was set to 10. The models were trained for 10K generator iterations with a batch size of 256 using the Adam optimizer with a learning rate of 10⁻⁴, β1 = 0.5, and β2 = 0.9. We updated the discriminator 5 times for each generator iteration. (A training-configuration sketch follows the table.) |
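The refinement loop in the Pseudocode row maps directly onto a few lines of PyTorch. The sketch below is illustrative rather than the official implementation: `dgflow_refine`, `f_prime`, and the hyperparameter defaults are assumed names and values, with the density ratio estimated as e^{−d_φ(x)} per Algorithm 1.

```python
import torch

def dgflow_refine(g, d, f_prime, z0, n_steps=25, eta=0.01, gamma=0.01):
    """Minimal sketch of DGflow's latent-space refinement (Algorithm 1).

    g, d    -- pretrained generator and discriminator (torch.nn.Module)
    f_prime -- first derivative of the f-divergence function f,
               e.g. lambda r: torch.log(r) + 1 for the KL divergence
    z0      -- latent codes sampled from the prior, shape (B, latent_dim)
    """
    z = z0.clone()
    for _ in range(n_steps):
        z.requires_grad_(True)
        ratio = torch.exp(-d(g(z)))      # density-ratio estimate e^{-d_phi(x)}
        grad = torch.autograd.grad(f_prime(ratio).sum(), z)[0]
        xi = torch.randn_like(z)         # noise term of the discretized SDE
        z = (z - eta * grad + (2 * eta * gamma) ** 0.5 * xi).detach()
    return g(z)                          # the refined samples
```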
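The 25 Gaussians row is concrete enough to reconstruct in a few lines of NumPy. This is a hedged sketch, with `make_25_gaussians` a hypothetical helper name; only the sample count, grid of means, and standard deviation come from the text.

```python
import numpy as np

def make_25_gaussians(n_samples=100_000, std=0.05, seed=0):
    """Mixture of 25 equally likely isotropic 2D Gaussians with means on
    the grid {-4, -2, 0, 2, 4} x {-4, -2, 0, 2, 4} and std 0.05."""
    rng = np.random.default_rng(seed)
    grid = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
    means = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)  # (25, 2)
    idx = rng.integers(0, len(means), size=n_samples)  # equally likely components
    return means[idx] + std * rng.standard_normal((n_samples, 2))
```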
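The KDE dependency noted above corresponds to a standard scikit-learn call. A minimal sketch, assuming `samples` stands in for the refined 2D samples being evaluated:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

samples = np.random.default_rng(0).standard_normal((1000, 2))  # stand-in data
# Gaussian kernel with bandwidth 0.1, as stated in the report.
kde = KernelDensity(kernel="gaussian", bandwidth=0.1).fit(samples)
log_density = kde.score_samples(samples)  # per-point log-density estimates
```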
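The experiment-setup row pins down the optimizer and penalty settings but defers the exact architectures to Gulrajani et al. (2017). The sketch below therefore assumes the hidden widths and layer counts; only the fully-connected/ReLU structure, Adam settings (lr 10⁻⁴, β1 = 0.5, β2 = 0.9), and gradient-penalty factor of 10 come from the text.

```python
import torch

HIDDEN = 512  # assumed width; see Gulrajani et al. (2017) for the real nets
generator = torch.nn.Sequential(
    torch.nn.Linear(2, HIDDEN), torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN, HIDDEN), torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN, 2),
)
discriminator = torch.nn.Sequential(
    torch.nn.Linear(2, HIDDEN), torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN, HIDDEN), torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN, 1),
)

# Adam settings reported in the row above.
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.9))
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))

def gradient_penalty(critic, real, fake, factor=10.0):
    """WGAN-GP penalty (Gulrajani et al., 2017) with the paper's factor of 10."""
    alpha = torch.rand(real.size(0), 1)  # per-sample mixing weight
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return factor * ((grad.norm(2, dim=1) - 1) ** 2).mean()
```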