Improved Precision and Recall Metric for Assessing Generative Models

Authors: Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, Timo Aila

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our metric in StyleGAN and BigGAN by providing several illustrative examples where existing metrics yield uninformative or contradictory results. Furthermore, we analyze multiple design variants of StyleGAN to better understand the relationships between the model architecture, training methods, and the properties of the resulting sample distribution. We demonstrate the effectiveness of our metric using two recent generative models (Section 3), StyleGAN [12] and BigGAN [4].
Researcher Affiliation | Collaboration | Tuomas Kynkäänniemi (Aalto University, NVIDIA) tuomas.kynkaanniemi@aalto.fi; Tero Karras (NVIDIA) tkarras@nvidia.com; Samuli Laine (NVIDIA) slaine@nvidia.com; Jaakko Lehtinen (Aalto University, NVIDIA) jlehtinen@aalto.fi; Timo Aila (NVIDIA) taila@nvidia.com
Pseudocode | Yes | See Appendix A in the supplement for pseudocode.
Open Source Code | Yes | Source code of our metric is available at https://github.com/kynkaat/improved-precision-and-recall-metric.
Open Datasets | Yes | We examine two state-of-the-art generative models, StyleGAN [12] trained with the FFHQ dataset, and BigGAN [4] trained on ImageNet [5].
Dataset Splits | No | The paper discusses training and testing of models, but it does not explicitly describe a validation split (e.g., percentages, sample counts, or an explicit mention of a validation set).
Hardware Specification | No | The paper does not specify the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper references various models and frameworks (e.g., VGG-16, Inception-v3), but it does not list specific software dependencies or library versions required to replicate the experiments.
Experiment Setup | Yes | Thus we use k = 3 and |Φ| = 50000 in all our experiments unless stated otherwise. StyleGAN [12] is used in all experiments, trained with FFHQ at 1024×1024; reducing the γ parameter by 100 shifts the balance even further (C).
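The experiment-setup row references the paper's central construction: each feature set defines an approximate manifold as the union of hyperspheres whose radii reach each sample's k-th nearest neighbor (k = 3), and precision/recall are the fractions of generated/real samples falling inside the other set's manifold. Below is a minimal NumPy sketch of that idea under stated assumptions; the function names and the toy Gaussian features are illustrative, not the authors' API, and the released repository (which also extracts VGG-16 features) should be used for real evaluations.

```python
import numpy as np

def pairwise_dist(a, b):
    # Euclidean distance matrix between rows of a and rows of b
    aa = (a ** 2).sum(axis=1)[:, None]
    bb = (b ** 2).sum(axis=1)[None, :]
    return np.sqrt(np.maximum(aa + bb - 2.0 * a @ b.T, 0.0))

def knn_radii(feats, k=3):
    # distance from each point to its k-th nearest neighbor; column 0 of the
    # sorted distance matrix is the (near-)zero self-distance, so index k works
    d = np.sort(pairwise_dist(feats, feats), axis=1)
    return d[:, k]

def manifold_coverage(query, ref, k=3):
    # fraction of query points that fall inside at least one k-NN hypersphere
    # centered on a reference point (the paper's manifold-membership test)
    radii = knn_radii(ref, k)
    d = pairwise_dist(query, ref)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

# Toy stand-ins for embedded image features (assumption: the real metric
# uses VGG-16 feature vectors and |Phi| = 50000 samples per set).
rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 16))
fake = rng.normal(size=(1000, 16))

precision = manifold_coverage(fake, real, k=3)  # generated samples inside real manifold
recall = manifold_coverage(real, fake, k=3)     # real samples inside generated manifold
```

With identical distributions both values are high; shifting the fake features far from the real ones drives precision toward zero, which is the behavior the paper exploits to diagnose truncation and regularization trade-offs.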