Training Generative Adversarial Networks with Limited Data
Authors: Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We also find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and improve the record FID from 5.59 to 2.42. |
| Researcher Affiliation | Collaboration | Tero Karras, NVIDIA (tkarras@nvidia.com); Miika Aittala, NVIDIA (maittala@nvidia.com); Janne Hellsten, NVIDIA (jhellsten@nvidia.com); Samuli Laine, NVIDIA (slaine@nvidia.com); Jaakko Lehtinen, NVIDIA and Aalto University (jlehtinen@nvidia.com); Timo Aila, NVIDIA (taila@nvidia.com) |
| Pseudocode | No | The paper includes flowcharts in Figure 2, but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation and models are available at https://github.com/NVlabs/stylegan2-ada |
| Open Datasets | Yes | For our baseline, we considered StyleGAN2 [19] and BigGAN [5, 30]. We measure quality by computing FID between 50k generated images and all available training images, as recommended by Heusel et al. [16], regardless of the subset actually used for training. |
| Dataset Splits | No | The paper mentions using a "separate validation set" and discusses its behavior for quantifying overfitting, but does not specify the exact split percentages or number of samples for this validation set. |
| Hardware Specification | Yes | but runs 4.6x faster on NVIDIA DGX-1. |
| Software Dependencies | No | The paper mentions software like StyleGAN2 and BigGAN but does not provide specific version numbers for underlying libraries or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use 2× fewer feature maps, a 2× larger minibatch, mixed-precision training for layers at resolutions ≥ 32², η = 0.0025, γ = 1, and an exponential moving average half-life of 20k images for generator weights. |
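The FID metric quoted in the table is the Fréchet distance between two Gaussians fitted to Inception feature statistics of real and generated images. A minimal sketch of the closed-form distance, assuming the feature means and covariances have already been extracted (the feature-extraction network itself is omitted):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two Gaussians:
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    # sqrtm can return tiny imaginary components due to numerical error.
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical statistics yield a distance of zero; in the paper's protocol, one side comes from 50k generated images and the other from all available training images.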
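The "exponential moving average half-life of 20k images" in the setup row means the per-step EMA decay is derived from the minibatch size so that a snapshot's weight halves every 20k training images. A hedged sketch of that conversion (function names are illustrative, not from the paper's codebase):

```python
def ema_beta(minibatch_size, half_life_images=20_000):
    # Geometric decay per minibatch: after half_life_images images
    # (half_life_images / minibatch_size steps), beta**steps == 0.5.
    return 0.5 ** (minibatch_size / half_life_images)

def update_ema(ema_params, current_params, beta):
    # Blend the running average toward the current generator weights.
    return [beta * e + (1.0 - beta) * c for e, c in zip(ema_params, current_params)]
```

For example, with a minibatch of 64 images, beta ≈ 0.99778, and raising it to the 312.5 steps that make up 20k images gives exactly 0.5.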