Understanding Over-parameterization in Generative Adversarial Networks

Authors: Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we present a comprehensive analysis of the importance of model overparameterization in GANs both theoretically and empirically. We theoretically show that in an overparameterized GAN model with a 1-layer neural network generator and a linear discriminator, GDA converges to a global saddle point of the underlying non-convex concave min-max problem. We also empirically study the role of model overparameterization in GANs using several large-scale experiments on CIFAR-10 and Celeb-A datasets. (A toy GDA sketch of this setting appears below the table.)
Researcher Affiliation | Academia | Yogesh Balaji (1), Mohammadmahdi Sajedi (2), Neha Mukund Kalibhat (1), Mucong Ding (1), Dominik Stöger (2), Mahdi Soltanolkotabi (2), Soheil Feizi (1); (1) University of Maryland, College Park, MD; (2) University of Southern California, Los Angeles, CA
Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | In this section, we demonstrate benefits of overparameterization in large GAN models. In particular, we train GANs on two benchmark datasets: CIFAR-10 (32×32 resolution) and Celeb-A (64×64 resolution). (A dataset-loading sketch appears below the table.)
Dataset Splits | No | The paper mentions using a "held-out validation set" for FID scores but does not specify the exact percentages or sample counts for training/validation/test splits needed to reproduce the data partitioning for model training.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using "Adam" as an optimizer but does not specify version numbers for any software components, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | Both DCGAN and Resnet-based GAN models are optimized using the commonly used hyper-parameters: Adam with learning rate 0.0001 and betas (0.5, 0.999) for DCGAN, gradient penalty of 10 and 5 critic iterations per generator's iteration for both DCGAN and Resnet-based GAN models. Models are trained for 300,000 iterations with a batch size of 64. (A hedged training-loop sketch appears below the table.)
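
The theoretical setting quoted under Research Type (a one-hidden-layer generator, a linear discriminator, and gradient descent ascent) can be illustrated with the toy reconstruction below. This is not the authors' code: the dimensions, step sizes, which generator layers are trainable, and the Gaussian stand-in for "real" data are all illustrative assumptions.

```python
# Toy sketch (not the authors' code): one-hidden-layer generator G(z) = V relu(W z),
# linear discriminator D(x) = <w, x>, and simultaneous gradient descent-ascent (GDA)
# on a WGAN-style min-max objective. All sizes and step sizes are assumed.
import torch

d, k, p = 16, 256, 8                                      # data dim, hidden width (overparameterized), latent dim
W = torch.randn(k, p).requires_grad_(True)                # generator hidden-layer weights
V = (torch.randn(d, k) / k ** 0.5).requires_grad_(True)   # generator output weights
w = torch.zeros(d).requires_grad_(True)                   # linear discriminator weights

eta_g, eta_d = 1e-3, 1e-2                                 # descent / ascent step sizes
real = torch.randn(4096, d) @ torch.randn(d, d)           # placeholder "real" data

for step in range(5000):
    z = torch.randn(256, p)
    fake = torch.relu(z @ W.t()) @ V.t()                  # generator samples
    x = real[torch.randint(len(real), (256,))]
    obj = (x @ w).mean() - (fake @ w).mean()              # min over (W, V), max over w
    gW, gV, gw = torch.autograd.grad(obj, (W, V, w))
    with torch.no_grad():
        W -= eta_g * gW                                   # generator: gradient descent
        V -= eta_g * gV
        w += eta_d * gw                                   # discriminator: gradient ascent
```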
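
For the two benchmarks quoted under Open Datasets, a loading sketch at the stated resolutions (CIFAR-10 natively 32×32, Celeb-A center-cropped to 64×64) might look as follows; the root path and the normalization choice are assumptions, not taken from the paper.

```python
# Dataset-loading sketch using torchvision; preprocessing beyond resize/crop is assumed.
import torchvision
import torchvision.transforms as T

cifar10 = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True,             # CIFAR-10 is already 32x32
    transform=T.Compose([T.ToTensor(), T.Normalize((0.5,) * 3, (0.5,) * 3)]),
)

celeba = torchvision.datasets.CelebA(
    root="./data", split="train", download=True,
    transform=T.Compose([
        T.Resize(64), T.CenterCrop(64),                    # 64x64 resolution
        T.ToTensor(), T.Normalize((0.5,) * 3, (0.5,) * 3),
    ]),
)
```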
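
The Experiment Setup row pins down the reported hyper-parameters: Adam with learning rate 0.0001 and betas (0.5, 0.999), gradient penalty weight 10, 5 critic iterations per generator iteration, 300,000 iterations, batch size 64. The sketch below wires those values into a WGAN-GP style loop; the tiny linear networks and random tensors are placeholders standing in for the paper's DCGAN/ResNet architectures and data loaders, not the authors' implementation.

```python
# Hyper-parameter sketch built only from the values quoted above; models and data are placeholders.
import torch
import torch.nn as nn

BATCH_SIZE, TOTAL_ITERS, CRITIC_ITERS, GP_WEIGHT = 64, 300_000, 5, 10.0
LATENT_DIM, DATA_DIM = 128, 3 * 32 * 32

G = nn.Sequential(nn.Linear(LATENT_DIM, DATA_DIM))        # placeholder generator
D = nn.Sequential(nn.Linear(DATA_DIM, 1))                 # placeholder critic

opt_G = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.999))

def gradient_penalty(D, real, fake):
    # WGAN-GP penalty on interpolates between real and generated samples
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)[0]
    return ((grad.norm(2, dim=1) - 1) ** 2).mean()

for _ in range(TOTAL_ITERS):
    for _ in range(CRITIC_ITERS):                          # 5 critic steps per generator step
        real = torch.randn(BATCH_SIZE, DATA_DIM)           # stand-in for a real data batch
        fake = G(torch.randn(BATCH_SIZE, LATENT_DIM)).detach()
        d_loss = D(fake).mean() - D(real).mean() + GP_WEIGHT * gradient_penalty(D, real, fake)
        opt_D.zero_grad()
        d_loss.backward()
        opt_D.step()
    g_loss = -D(G(torch.randn(BATCH_SIZE, LATENT_DIM))).mean()
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
```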