Assessing Generative Models via Precision and Recall

Authors: Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the practical utility of the proposed approach we perform an empirical study on several variants of Generative Adversarial Networks and Variational Autoencoders. In an extensive set of experiments we show that the proposed metric is able to disentangle the quality of generated samples from the coverage of the target distribution.
Researcher Affiliation | Collaboration | Mehdi S. M. Sajjadi (MPI for Intelligent Systems; Max Planck ETH Center for Learning Systems), Olivier Bachem (Google Brain), Mario Lucic (Google Brain), Olivier Bousquet (Google Brain), Sylvain Gelly (Google Brain). This work was done during an internship at Google Brain.
Pseudocode | No | The paper describes the algorithm mathematically and verbally, but does not provide a formal pseudocode block or algorithm listing. (A code sketch of the PRD computation follows the table.)
Open Source Code | Yes | An implementation of the algorithm is available at https://github.com/msmsajjadi/precision-recall-distributions.
Open Datasets | Yes | We consider three data sets commonly used in the GAN literature: MNIST [15], Fashion-MNIST [25], and CIFAR-10 [13]... we use the MultiNLI corpus... Following [6], we embed these sentences using a BiLSTM with 2048 cells in each direction and max pooling, leading to a 4096-dimensional embedding [7].
Dataset Splits | Yes | Then, for a fixed i = 1, ..., 10, we generate a set Q̂_i, which consists of samples from the first i classes from the training set. (A sketch of this split protocol follows the table.)
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., CPU or GPU models, memory); it only implies that computations ran on standard machines.
Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | We then cluster the union of P̂ and Q̂ in this feature space using mini-batch k-means with k = 20 [21]. As the clustering algorithm is randomized, we run the procedure several times and average over the PRD curves... Following [6], we embed these sentences using a BiLSTM with 2048 cells in each direction and max pooling, leading to a 4096-dimensional embedding [7]. (A sketch of the clustering step follows the table.)
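
Since the paper states the PRD curve only as formulas, here is a minimal NumPy sketch of that computation, following the paper's definitions α(λ) = Σ_ω min(λ·P(ω), Q(ω)) and β(λ) = α(λ)/λ, with λ = tan(θ) swept over θ ∈ (0, π/2). The function name and signature are illustrative; the authors' repository linked above contains the reference implementation.

```python
import numpy as np

def prd_curve(eval_dist, ref_dist, num_angles=1001, epsilon=1e-10):
    """Precision/recall (PRD) pairs between two discrete distributions.

    eval_dist: cluster histogram of generated samples (Q-hat), shape (k,)
    ref_dist:  cluster histogram of real samples (P-hat), shape (k,)
    """
    # Sweep lambda = tan(theta) over (0, inf) via angles in (0, pi/2).
    angles = np.linspace(epsilon, np.pi / 2 - epsilon, num=num_angles)
    slopes = np.tan(angles)
    # alpha(lambda) = sum_i min(lambda * p_i, q_i), broadcast over all angles.
    precision = np.minimum(slopes[:, None] * ref_dist[None, :],
                           eval_dist[None, :]).sum(axis=1)
    # beta(lambda) = alpha(lambda) / lambda.
    recall = precision / slopes
    return precision, recall
```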
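
The class-coverage split quoted in the Dataset Splits row reduces to a simple label filter. A sketch with hypothetical array names, assuming integer labels 0–9:

```python
def first_i_classes(samples, labels, i):
    """Build Q-hat_i as described in the paper: keep only training
    samples whose label falls in the first i classes (0, ..., i-1)."""
    return samples[labels < i]

# One set per i = 1, ..., 10:
# splits = [first_i_classes(x_train, y_train, i) for i in range(1, 11)]
```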
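
Finally, a sketch of the quoted clustering step, assuming scikit-learn's MiniBatchKMeans as the mini-batch k-means implementation (the paper cites the algorithm [21] but names no library; the helper name is hypothetical): pool the real and generated feature vectors, cluster them jointly with k = 20, and read off the two normalized cluster histograms that feed prd_curve above. Because the clustering is randomized, the authors repeat it several times and average the resulting PRD curves.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def cluster_histograms(real_feats, gen_feats, k=20, seed=0):
    """Jointly cluster real and generated features; return the normalized
    cluster histograms (Q-hat, P-hat) consumed by prd_curve()."""
    pooled = np.vstack([real_feats, gen_feats])
    labels = MiniBatchKMeans(n_clusters=k, random_state=seed,
                             n_init=10).fit_predict(pooled)
    real_labels = labels[:len(real_feats)]
    gen_labels = labels[len(real_feats):]
    ref_dist = np.bincount(real_labels, minlength=k) / len(real_labels)
    eval_dist = np.bincount(gen_labels, minlength=k) / len(gen_labels)
    return eval_dist, ref_dist

# Average PRD curves over several randomized clusterings, as in the paper:
# curves = [prd_curve(*cluster_histograms(real, fake, seed=s)) for s in range(10)]
# precision = np.mean([p for p, _ in curves], axis=0)
# recall = np.mean([r for _, r in curves], axis=0)
```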