On Characterizing GAN Convergence Through Proximal Duality Gap

Authors: Sahil Sidheekh, Aroof Aimen, Narayanan C Krishnan

ICML 2021

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "Finally, we validate experimentally the usefulness of proximal duality gap for monitoring and influencing GAN training." (Sec. 5, Experimentation) "To experimentally establish the proficiency of DGλ, we consider a WGAN with weight-clipping (that optimizes Vw) (Arjovsky et al., 2017) and a Spectral Normalized GAN (SNGAN) (that optimizes Vc) (Miyato et al., 2018) over 3 datasets: MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2014) and CELEB-A (Liu et al., 2015)."

Researcher Affiliation | Academia | "Department of Computer Science, Indian Institute of Technology, Ropar, India. Correspondence to: Sahil Sidheekh <2017csb1104@iitrpr.ac.in>, Narayanan C Krishnan <ckn@iitrpr.ac.in>."

Pseudocode | No | "The algorithm for the overall estimation process and the associated computational complexity are discussed in the supplementary material."

Open Source Code | Yes | "Further implementation details for each experiment are provided in the supplementary material and the source code is publicly available." (Footnote 1: https://github.com/proximal-dg/proximal_dg)

Open Datasets | Yes | "We consider a WGAN with weight-clipping... and a Spectral Normalized GAN (SNGAN)... over 3 datasets: MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2014) and CELEB-A (Liu et al., 2015)."

Dataset Splits | No | The paper mentions training on specific datasets (MNIST, CIFAR-10, CELEB-A) but does not provide explicit training, validation, and test splits (e.g., percentages or sample counts) or references to predefined splits.

Hardware Specification | No | The paper does not specify any particular hardware components (e.g., GPU models, CPU types, or memory) used for running the experiments.

Software Dependencies | No | "We used the torchgan framework (Pal & Das, 2019) to train and evaluate all GAN models."

Experiment Setup | Yes | "For all the experiments, we use the 4-layer DCGAN (Radford et al., 2016) architecture for both the generator and the discriminator networks, and an Adam optimizer (Kingma & Ba, 2015) to train the models. To compute DGλ, we use λ=0.1 and 20 optimization steps for approximating the proximal objective. We train WGAN over the MNIST dataset by performing a grid search over N in the range 10 to 10..."
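
The Experiment Setup row reports that DGλ is computed with λ = 0.1 and 20 optimization steps to approximate the proximal objective. The sketch below is one plausible reading of that procedure, not the authors' released code (see the GitHub link above for that): it perturbs the discriminator toward a proximally regularized inner maximization and the generator toward the dual minimization, then reports the difference. The function names, the learning rate, and the exact form of the proximal objective are illustrative assumptions.

```python
import copy
import torch

def estimate_proximal_duality_gap(G, D, game_value, data_batch,
                                  lambda_=0.1, steps=20, lr=1e-4):
    """Illustrative estimator of a proximal duality gap DG_lambda.

    `game_value(G, D, batch)` is assumed to return the minimax objective
    V(G, D) as a scalar tensor (e.g., the Wasserstein objective Vw).
    lambda_=0.1 and steps=20 mirror the settings quoted above; lr is an
    assumed value, not one reported in the excerpts.
    """
    # Proximal primal term: ascend on V(G, D') - (lambda/2)||D' - D||^2,
    # so the perturbed discriminator stays close to the current one.
    D_prox = copy.deepcopy(D)
    ref = [p.detach().clone() for p in D.parameters()]
    opt_d = torch.optim.Adam(D_prox.parameters(), lr=lr)
    for _ in range(steps):
        opt_d.zero_grad()
        penalty = sum(((p - r) ** 2).sum()
                      for p, r in zip(D_prox.parameters(), ref))
        # Ascend on V by minimizing its negative plus the proximal penalty.
        (-game_value(G, D_prox, data_batch) + 0.5 * lambda_ * penalty).backward()
        opt_d.step()

    # Dual term: descend on V(G', D) with the discriminator held fixed.
    G_min = copy.deepcopy(G)
    opt_g = torch.optim.Adam(G_min.parameters(), lr=lr)
    for _ in range(steps):
        opt_g.zero_grad()
        game_value(G_min, D, data_batch).backward()
        opt_g.step()

    # Note: gradients also accumulate in the fixed networks' parameters above;
    # in practice one would freeze them or zero their grads afterwards.
    with torch.no_grad():
        primal = game_value(G, D_prox, data_batch)  # ~ proximal objective at (G, D)
        dual = game_value(G_min, D, data_batch)     # ~ min over generators at D
    return (primal - dual).item()
```

In the quoted experiments this quantity is tracked during training as a convergence monitor; under the paper's characterization, a value approaching zero indicates the game is nearing a (proximal) equilibrium.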
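For context on the weight-clipped WGAN mentioned in the Research Type and Open Datasets rows, a minimal critic update of that kind might look as follows. The clip value (0.01) and the latent shape are standard WGAN/DCGAN defaults assumed for illustration; they are not values stated in the excerpts above.

```python
import torch

def wgan_critic_step(G, D, opt_d, real_batch, z_dim=100, clip_value=0.01):
    """One weight-clipped WGAN critic update (in the style of Arjovsky et al., 2017).

    Maximizes E[D(real)] - E[D(G(z))] by descending on its negative, then
    clamps the critic weights as a crude Lipschitz constraint. z_dim and
    clip_value are assumed defaults, not settings quoted from the paper.
    """
    opt_d.zero_grad()
    # A DCGAN-style generator typically expects a (N, z_dim, 1, 1) latent.
    z = torch.randn(real_batch.size(0), z_dim, 1, 1, device=real_batch.device)
    fake_batch = G(z).detach()  # detach: only the critic is updated here
    critic_loss = -(D(real_batch).mean() - D(fake_batch).mean())
    critic_loss.backward()
    opt_d.step()
    for p in D.parameters():
        p.data.clamp_(-clip_value, clip_value)
    return -critic_loss.item()  # current estimate of the Wasserstein objective Vw
```

The SNGAN variant mentioned in the same rows replaces this weight clipping with spectral normalization of the discriminator's weights (Miyato et al., 2018) while optimizing the standard objective Vc.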