Improved Precision and Recall Metric for Assessing Generative Models
Authors: Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, Timo Aila
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our metric in StyleGAN and BigGAN by providing several illustrative examples where existing metrics yield uninformative or contradictory results. Furthermore, we analyze multiple design variants of StyleGAN to better understand the relationships between the model architecture, training methods, and the properties of the resulting sample distribution. We demonstrate the effectiveness of our metric using two recent generative models (Section 3), StyleGAN [12] and BigGAN [4]. |
| Researcher Affiliation | Collaboration | Tuomas Kynkäänniemi (Aalto University, NVIDIA) tuomas.kynkaanniemi@aalto.fi; Tero Karras (NVIDIA) tkarras@nvidia.com; Samuli Laine (NVIDIA) slaine@nvidia.com; Jaakko Lehtinen (Aalto University, NVIDIA) jlehtinen@aalto.fi; Timo Aila (NVIDIA) taila@nvidia.com |
| Pseudocode | Yes | See Appendix A in the supplement for pseudocode. |
| Open Source Code | Yes | Source code of our metric is available at https://github.com/kynkaat/improved-precision-and-recall-metric. |
| Open Datasets | Yes | We examine two state-of-the-art generative models, StyleGAN [12] trained with the FFHQ dataset, and BigGAN [4] trained on ImageNet [5]. |
| Dataset Splits | No | The paper discusses training and testing of models, but it does not explicitly provide details about a validation dataset split (e.g., percentages, sample counts, or explicit mention of a validation set). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU or GPU models used for running the experiments. |
| Software Dependencies | No | The paper references various models and frameworks (e.g., VGG-16, Inception-v3), but it does not list any specific software dependencies or libraries with version numbers required to replicate the experiments. |
| Experiment Setup | Yes | Thus we use k = 3 and |Φ| = 50000 in all our experiments unless stated otherwise. We use StyleGAN [12] in all experiments, trained with FFHQ at 1024×1024. Reducing the γ parameter by 100 shifts the balance even further (C). |
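The experiment-setup row refers to the paper's k-nearest-neighbour hypersphere construction: each feature set's manifold is approximated by hyperspheres around each sample with radius equal to the distance to its k-th nearest neighbour, and precision (resp. recall) is the fraction of generated (resp. real) features falling inside the other set's manifold. Below is a minimal NumPy sketch of that estimator; function names are illustrative, and the official implementation (linked above) operates on VGG-16 feature embeddings rather than raw vectors.

```python
import numpy as np

def kth_nn_radii(feats, k=3):
    """Radius of each sample's hypersphere: distance to its k-th nearest
    neighbour within the same set (self-distance excluded via sorting)."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    d.sort(axis=1)           # column 0 is the zero self-distance
    return d[:, k]

def manifold_coverage(query, ref, ref_radii):
    """Fraction of query points lying inside any reference hypersphere."""
    d = np.linalg.norm(query[:, None, :] - ref[None, :, :], axis=-1)
    return float(np.mean(np.any(d <= ref_radii[None, :], axis=1)))

def improved_precision_recall(real_feats, fake_feats, k=3):
    """Precision: generated samples covered by the real manifold.
    Recall: real samples covered by the generated manifold."""
    precision = manifold_coverage(fake_feats, real_feats,
                                  kth_nn_radii(real_feats, k))
    recall = manifold_coverage(real_feats, fake_feats,
                               kth_nn_radii(fake_feats, k))
    return precision, recall
```

With the paper's settings (k = 3, 50000 samples per set), the pairwise-distance matrices above would be too large to materialise at once; a practical implementation batches the distance computations, as the released code does.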