Projected GANs Converge Faster
Authors: Axel Sauer, Kashyap Chitta, Jens Müller, Andreas Geiger
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on small and large datasets with a resolution of up to 1024² pixels. Across all datasets, we demonstrate state-of-the-art image synthesis results at significantly reduced training time (Fig. 1). We also find that Projected GANs increase data efficiency and avoid the need for additional regularization, rendering expensive hyperparameter sweeps unnecessary. |
| Researcher Affiliation | Academia | University of Tübingen; Max Planck Institute for Intelligent Systems, Tübingen; Computer Vision and Learning Lab, Heidelberg University |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code, models, and supplementary videos can be found on the project page https://sites.google.com/view/projected-gan. |
| Open Datasets | Yes | We conduct experiments on LSUN-Church [67], which is medium-sized (126k images) and reasonably visually complex, using a resolution of 256² pixels. We also created two subsets of the 70k CLEVR dataset [22] by randomly subsampling 10k and 1k images from it, respectively (a subsampling sketch follows the table). Besides CLEVR and LSUN-Church, we benchmark Projected GANs against various state-of-the-art models on three other large datasets: LSUN-Bedroom [67] (3M indoor bedroom scenes), FFHQ [26] (70k images of faces) and Cityscapes [6] (25k driving scenes captured from a vehicle). We compare against StyleGAN2-ADA and FastGAN on art paintings from WikiArt (1000 images; wikiart.org), Oxford Flowers (1360 images) [42], photographs of landscapes (4319 images; flickr.com), Animal Face-Dog (389 images) [57] and Pokemon (833 images; pokemon.com). Lastly, we evaluate on AFHQ-Cat, -Dog and -Wild at 512² [5]. The AFHQ datasets contain 5k close-ups per category (cat, dog, or wildlife). We do not have a license to re-distribute these datasets, but we provide the URLs to enable reproducibility, similar to [35]. |
| Dataset Splits | No | The paper mentions training until '1 million real images have been shown to the discriminator' and evaluating with '50k generated and all real images' for FID. However, it does not provide specific percentages or counts for training, validation, and test splits for model development and evaluation. |
| Hardware Specification | Yes | With this setting, each experiment takes roughly 100-200 GPU hours on an NVIDIA V100; for more details we refer to the appendix. |
| Software Dependencies | No | The paper mentions implementing baselines 'within the codebase provided by the authors of StyleGAN2-ADA [25]' but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For the generator G we use the generator architecture of FastGAN [35], consisting of several upsampling blocks, with additional skip-layer excitation blocks. Using a hinge loss [33], we train with a batch size of 64 until 1 million real images have been shown to the discriminator... Projected GANs use the same generator and discriminator architecture and training hyperparameters (learning rate and batch size) for all experiments (a hinge-loss sketch follows the table). |
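
The Open Datasets row mentions that the CLEVR-10k and CLEVR-1k subsets were created by randomly subsampling the full 70k CLEVR set. The paper does not describe how this was done; the snippet below is a minimal sketch of one way to do it, where the directory layout, file extension, and fixed seed are assumptions rather than details from the paper.

```python
import random
from pathlib import Path

def subsample_dataset(image_dir, n_images, seed=0):
    """Randomly pick a fixed-size subset of image files (hypothetical helper)."""
    paths = sorted(Path(image_dir).glob("*.png"))  # assumed layout: one PNG per image
    rng = random.Random(seed)                      # fixed seed only for repeatability of the sketch
    return rng.sample(paths, n_images)

# e.g. the paper's CLEVR-10k and CLEVR-1k subsets, assuming images live in ./clevr/
clevr_10k = subsample_dataset("clevr", 10_000)
clevr_1k = subsample_dataset("clevr", 1_000)
```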
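
The Dataset Splits row notes that FID is evaluated with 50k generated images against all real images. The paper uses the standard Fréchet Inception Distance; below is a minimal sketch of the FID formula computed from Inception feature statistics, not the authors' evaluation code (extracting the Inception features is assumed to happen elsewhere).

```python
import numpy as np
from scipy import linalg

def fid_from_statistics(mu_real, sigma_real, mu_fake, sigma_fake):
    """FID between two Gaussians fitted to real and generated Inception features:
    ||mu_r - mu_f||^2 + Tr(Sigma_r + Sigma_f - 2 (Sigma_r Sigma_f)^(1/2))."""
    diff = mu_real - mu_fake
    covmean, _ = linalg.sqrtm(sigma_real @ sigma_fake, disp=False)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts from numerical error
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma_real + sigma_fake - 2.0 * covmean))
```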
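
The Experiment Setup row states that training uses a hinge loss [33] with a batch size of 64. The functions below are a minimal PyTorch-style sketch of the standard hinge GAN objective; the names are illustrative, and the authors' actual training code (built on the StyleGAN2-ADA codebase) may differ, especially since Projected GANs aggregate this loss over multiple discriminators operating on projected features.

```python
import torch.nn.functional as F

def discriminator_hinge_loss(real_logits, fake_logits):
    # Push discriminator logits above +1 on real images and below -1 on fakes.
    return F.relu(1.0 - real_logits).mean() + F.relu(1.0 + fake_logits).mean()

def generator_hinge_loss(fake_logits):
    # The generator maximizes the discriminator's logits on generated images.
    return -fake_logits.mean()
```

With a batch size of 64, showing 1 million real images to the discriminator corresponds to roughly 15,600 training iterations.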