On Self-Supervised Image Representations for GAN Evaluation

Authors: Stanislav Morozov, Andrey Voynov, Artem Babenko

ICLR 2021

Reproducibility Assessment (variable, result, and the LLM's supporting response):

Research Type: Experimental
LLM Response: "With extensive comparison of the recent GANs on the standard datasets, we demonstrate that self-supervised representations produce a more reasonable ranking of models in terms of FID/Precision/Recall, while the ranking with classification-pretrained embeddings often can be misleading."

Researcher Affiliation: Collaboration
LLM Response: "Stanislav Morozov, Marchuk Institute of Numerical Mathematics RAS, stanis-morozov@yandex.ru; Andrey Voynov, Yandex, avoin@yandex-team.ru; Artem Babenko, Yandex / HSE University, artem.babenko@phystech.edu"

Pseudocode: No
LLM Response: "The paper describes methods in text and references external autoencoder techniques, but does not include any pseudocode or algorithm blocks."

Open Source Code: Yes
LLM Response: "We release the code for the self-supervised GAN evaluation along with data and human labeling reported in the paper online (https://github.com/stanis-morozov/self-supervised-gan-eval)."

Open Datasets: Yes
LLM Response: "Celeba HQ 1024x1024 (Karras et al., 2017)... FFHQ 1024x1024 (Karras et al., 2019a)... LSUN Bedroom 256x256 (Yu et al., 2015)... LSUN Church 256x256 (Yu et al., 2015)... Imagenet 128x128 (Deng et al., 2009)... The Celeba dataset (Liu et al., 2018) provides labels of 40 attributes for each image..."

Dataset Splits: No
LLM Response: "To compute the metrics, we use 30k real and synthetic images" (Celeba HQ, FFHQ, LSUN Bedroom); "To compute the metrics, we use 100k real and synthetic images" (LSUN Church); "To compute the metrics, we use 50k images (50 per class)" (Imagenet).

Hardware Specification: No
LLM Response: "The paper does not explicitly describe the hardware specifications (e.g., CPU, GPU models, or memory) used for running the experiments."

Software Dependencies: No
LLM Response: "The paper mentions various models and their checkpoints, such as Inception V3, ResNet50, SwAV, DeepCluster V2, and MoCo V2, along with the GAN models used. However, it does not specify the versions of software dependencies like Python, PyTorch/TensorFlow, or CUDA."

Experiment Setup: Yes
LLM Response: "To compute the metrics, we use 30k real and synthetic images;... We include this model since self-supervised models employ ResNet50; therefore, it is important to demonstrate that better GAN ranking comes from the training objective rather than the deeper architecture;... For each attribute, we train a 4-layer feedforward neural network with 2048 neurons on each layer with cross-entropy loss, which learns to predict the attribute from the SwAV/Inception embedding. ... In this experiment, we employ a ResNet50 classifier with a binary cross-entropy loss."
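The metric the paper re-evaluates under different embeddings is the standard Fréchet distance (Heusel et al., 2017): swapping the Inception V3 feature extractor for a self-supervised one (e.g., SwAV) leaves the formula unchanged. The sketch below is not the authors' released implementation; it is a minimal numpy/scipy version of the standard FID computation, with random vectors standing in for the 2048-d embeddings an encoder would produce.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    FID = ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2 (S_r S_f)^(1/2))

    The features may come from any embedder: Inception V3 classically,
    or a self-supervised encoder such as SwAV as the paper advocates.
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    s_r = np.cov(feats_real, rowvar=False)
    s_f = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(s_r @ s_f)
    if np.iscomplexobj(covmean):      # discard tiny imaginary parts
        covmean = covmean.real        # from numerical error in sqrtm
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(s_r + s_f - 2.0 * covmean))

# Toy usage: random vectors stand in for encoder embeddings.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 16))
fake = rng.normal(loc=0.5, size=(500, 16))
print(fid(real, real))  # ~0 for identical feature sets
print(fid(real, fake))  # clearly positive for shifted features
```

In practice the feature matrices would hold embeddings of the 30k–100k real and synthetic images mentioned above, extracted by the chosen encoder.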
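The attribute-prediction probe quoted in the last row (a 4-layer feedforward network with 2048 neurons per layer, trained with cross-entropy to predict a CelebA attribute from a SwAV/Inception embedding) can be sketched as below. This is an illustrative reading, not the authors' code: the interpretation of "4-layer" as four hidden layers, the He initialization, and the toy dimensions are all assumptions; the paper's probe would use 2048 units and real embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(dims):
    """He-initialized weights and biases for a feedforward network."""
    return [(rng.normal(scale=np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def forward(params, x):
    for w, b in params[:-1]:
        x = np.maximum(x @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    return (x @ w + b).ravel()          # one logit per attribute

def bce_with_logits(z, y):
    """Numerically stable binary cross-entropy from raw logits."""
    return float(np.mean(np.maximum(z, 0) - z * y
                         + np.log1p(np.exp(-np.abs(z)))))

# Paper-scale dims would be [2048, 2048, 2048, 2048, 2048, 1]
# (2048-d embedding in, four 2048-unit hidden layers, one logit);
# smaller dims keep this demo cheap.
params = make_mlp([64, 64, 64, 64, 64, 1])
emb = rng.normal(size=(32, 64))          # stand-in for image embeddings
attr = rng.integers(0, 2, size=32)       # one binary CelebA attribute
loss = bce_with_logits(forward(params, emb), attr)
print(round(loss, 3))
```

A training loop (e.g., SGD on this loss) is omitted; the point is the probe's shape, embedding in, single attribute logit out.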