ViTGAN: Training GANs with Vision Transformers

Authors: Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our approach, named ViTGAN, achieves comparable performance to the leading CNN-based GAN models on three datasets: CIFAR-10, CelebA, and LSUN bedroom.
Researcher Affiliation | Collaboration | Kwonjoon Lee (1,3), Huiwen Chang (2), Lu Jiang (2), Han Zhang (2), Zhuowen Tu (1), Ce Liu (4); 1 UC San Diego, 2 Google Research, 3 Honda Research Institute, 4 Microsoft Azure AI. Emails: kwl042@eng.ucsd.edu, {huiwenchang,lujiang,zhanghan}@google.com, ztu@ucsd.edu, ce.liu@microsoft.com
Pseudocode | No | The paper includes mathematical equations and architectural diagrams (e.g., Figure 1, Figure 2) but no explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Empirically, our approach, named ViTGAN, achieves comparable performance to the leading CNN-based GAN models on three datasets: CIFAR-10, CelebA, and LSUN bedroom. Our code is available online: https://github.com/mlpc-ucsd/ViTGAN
Open Datasets | Yes | We train and evaluate our model on various standard benchmarks for image generation, including CIFAR-10 (Krizhevsky et al., 2009), LSUN bedroom (Yu et al., 2015) and CelebA (Liu et al., 2015).
Dataset Splits | Yes | The LSUN bedroom dataset (Yu et al., 2015) is a large-scale image generation benchmark, consisting of 3 million training images and 300 images for validation. On this dataset, FID is computed against the training set due to the small validation set.
Hardware Specification | Yes | Both ViTGAN and StyleGAN2 are based on a TensorFlow 2 implementation and trained on Google Cloud TPU v2-32 and v3-8.
Software Dependencies | Yes | Both ViTGAN and StyleGAN2 are based on a TensorFlow 2 implementation and trained on Google Cloud TPU v2-32 and v3-8.
Experiment Setup | Yes | We train our models with Adam with β1 = 0.0, β2 = 0.99, and a learning rate of 0.002, following the practice of Karras et al. (2020b). In addition, we employ the non-saturating logistic loss (Goodfellow et al., 2014), exponential moving average of generator weights (Karras et al., 2018), and equalized learning rate (Karras et al., 2018). We use a mini-batch size of 128.
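As a rough illustration of the quoted setup, the sketch below expresses the stated optimizer, loss, and batch-size choices in TensorFlow 2, the framework named above. The generator/discriminator handles and the EMA decay value are assumptions for illustration, not details taken from the paper.

```python
import tensorflow as tf

# Minimal sketch of the quoted training configuration (not the authors' code).

# Adam with beta1 = 0.0, beta2 = 0.99 and learning rate 0.002, as stated in the paper.
g_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-3, beta_1=0.0, beta_2=0.99)
d_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-3, beta_1=0.0, beta_2=0.99)

def non_saturating_g_loss(fake_logits):
    # Non-saturating logistic generator loss: -log(sigmoid(D(G(z)))) = softplus(-D(G(z))).
    return tf.reduce_mean(tf.nn.softplus(-fake_logits))

def logistic_d_loss(real_logits, fake_logits):
    # Discriminator counterpart: softplus(-D(x)) + softplus(D(G(z))).
    return tf.reduce_mean(tf.nn.softplus(-real_logits)) + tf.reduce_mean(tf.nn.softplus(fake_logits))

BATCH_SIZE = 128   # mini-batch size reported in the paper
EMA_DECAY = 0.999  # assumed decay; the paper only states that generator weights are averaged
```

Equalized learning rate and the exponential moving average of generator weights would be applied inside the model definition and the training loop, respectively; they are omitted here for brevity.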