Training Generative Adversarial Networks with Limited Data

Authors: Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We also find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and improve the record FID from 5.59 to 2.42." |
| Researcher Affiliation | Collaboration | Tero Karras (NVIDIA, tkarras@nvidia.com); Miika Aittala (NVIDIA, maittala@nvidia.com); Janne Hellsten (NVIDIA, jhellsten@nvidia.com); Samuli Laine (NVIDIA, slaine@nvidia.com); Jaakko Lehtinen (NVIDIA and Aalto University, jlehtinen@nvidia.com); Timo Aila (NVIDIA, taila@nvidia.com) |
| Pseudocode | No | The paper includes flowcharts in Figure 2, but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "Our implementation and models are available at https://github.com/NVlabs/stylegan2-ada" |
| Open Datasets | Yes | "For our baseline, we considered StyleGAN2 [19] and BigGAN [5, 30]. We measure quality by computing FID between 50k generated images and all available training images, as recommended by Heusel et al. [16], regardless of the subset actually used for training." (A sketch of this FID computation follows the table.) |
| Dataset Splits | No | The paper mentions using a "separate validation set" to quantify overfitting, but does not specify the split percentages or the number of samples in that set. |
| Hardware Specification | Yes | "...but runs 4.6× faster on NVIDIA DGX-1." |
| Software Dependencies | No | The paper mentions software such as StyleGAN2 and BigGAN but gives no version numbers for the underlying libraries or frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | "We use 2× fewer feature maps, 2× larger minibatch, mixed-precision training for layers at resolutions ≥ 32², η = 0.0025, γ = 1, and exponential moving average half-life of 20k images for generator weights." (The EMA half-life and the γ term are unpacked in the sketches after the table.) |
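The FID figure the paper reports (e.g., the CIFAR-10 record of 2.42) is the Fréchet distance between Gaussians fitted to Inception features of 50k generated images and of all available training images. The following is a minimal sketch of that final distance computation, not the authors' evaluation code; it assumes the feature means and covariances have already been extracted with an Inception network, and the function name is illustrative.

```python
import numpy as np
from scipy import linalg

def frechet_inception_distance(mu_real, sigma_real, mu_fake, sigma_fake):
    """Fréchet distance between two Gaussians fitted to Inception features:
    FID = ||mu_r - mu_f||^2 + Tr(Sigma_r + Sigma_f - 2 (Sigma_r Sigma_f)^(1/2))
    """
    diff = mu_real - mu_fake
    # Matrix square root of the covariance product; keep only the real part,
    # since sqrtm can pick up tiny imaginary components from numerical noise.
    covmean = np.real(linalg.sqrtm(sigma_real @ sigma_fake))
    return float(diff @ diff + np.trace(sigma_real + sigma_fake - 2.0 * covmean))
```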
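The "exponential moving average half-life of 20k images" in the setup quote specifies the EMA of generator weights in units of real images shown, not in optimizer steps. A sketch of the standard half-life-to-decay conversion, under the assumption that one EMA update is applied per minibatch (the formula is the usual half-life mapping, not quoted from the paper):

```python
def ema_beta(minibatch_size: int, half_life_kimg: float = 20.0) -> float:
    """Per-minibatch EMA decay such that a weight snapshot's influence
    halves after `half_life_kimg` thousand training images."""
    return 0.5 ** (minibatch_size / (half_life_kimg * 1000.0))

# Example: with a minibatch of 64 images,
#   beta = ema_beta(64)  ->  ~0.99778
# and after each optimizer step the averaged weights are updated as
#   w_ema = beta * w_ema + (1 - beta) * w_current
```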
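The γ = 1 in the same quote is, by the conventions of the StyleGAN2 line of work, assumed here to be the weight of the R1 gradient penalty on the discriminator. A hedged PyTorch sketch of that regularizer, with an illustrative function name:

```python
import torch

def r1_penalty(discriminator, real_images, gamma=1.0):
    """R1 regularizer, (gamma / 2) * E[ ||grad_x D(x)||^2 ] on real images.
    `gamma` is assumed to correspond to the 'γ = 1' in the setup quote."""
    real_images = real_images.detach().requires_grad_(True)
    scores = discriminator(real_images)
    # Gradient of the summed scores w.r.t. the input images, kept in the
    # graph so the penalty itself can be backpropagated.
    grads, = torch.autograd.grad(outputs=scores.sum(), inputs=real_images,
                                 create_graph=True)
    penalty = grads.square().sum(dim=[1, 2, 3]).mean()
    return (gamma / 2.0) * penalty
```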