Generative Adversarial Transformers

Authors: Drew A. Hudson, Larry Zitnick

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the model's strength and robustness through a careful evaluation over a range of datasets, from simulated multi-object environments to rich real-world indoor and outdoor scenes, showing it achieves state-of-the-art results in terms of image quality and diversity, while enjoying fast learning and better data-efficiency. Further qualitative and quantitative experiments offer us an insight into the model's inner workings, revealing improved interpretability and stronger disentanglement, and illustrating the benefits and efficacy of our approach. We investigate the GANsformer through a suite of experiments to study its quantitative performance and qualitative behavior.
Researcher Affiliation | Collaboration | Drew A. Hudson (Computer Science Department, Stanford University, CA, USA) and C. Lawrence Zitnick (Facebook AI Research, CA, USA). Correspondence to: Drew A. Hudson <dorarad@cs.stanford.edu>.
Pseudocode | No | The paper describes Simplex Attention and Duplex Attention using mathematical formulations and prose, but it does not include a clearly labeled pseudocode block or algorithm figure. (A hedged sketch of such an attention update is given after this table.)
Open Source Code | Yes | An implementation of the model is available at https://github.com/dorarad/gansformer.
Open Datasets | Yes | We investigate the GANsformer through a suite of experiments to study its quantitative performance and qualitative behavior. As we will see below, the GANsformer achieves state-of-the-art results, successfully producing high-quality images for a varied assortment of datasets: FFHQ for human faces (Karras et al., 2019), CLEVR for multi-object scenes (Johnson et al., 2017), and the LSUN-Bedroom (Yu et al., 2015) and Cityscapes (Cordts et al., 2016) datasets for challenging indoor and outdoor scenes.
Dataset Splits | No | The paper mentions training models with images of 256×256 resolution for a certain number of training steps and evaluates them on various metrics, but it does not specify explicit training, validation, and test dataset splits (e.g., percentages or exact counts) for reproducibility.
Hardware Specification | Yes | All models have been trained with images of 256×256 resolution and for the same number of training steps, roughly spanning a week on 2 NVIDIA V100 GPUs per model (or equivalently 3-4 days using 4 GPUs).
Software Dependencies | No | The paper states 'we implement them all within the codebase introduced by the StyleGAN authors. The only exception to that is the recent VQGAN model for which we use the official implementation.' However, it does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA.
Experiment Setup | No | The paper mentions adopting 'settings and techniques used in the StyleGAN and StyleGAN2 models (Karras et al., 2019; 2020), including in particular style mixing, stochastic variation, exponential moving average for weights, and a non-saturating logistic loss with lazy R1 regularization.' It also states 'All models have been trained with images of 256×256 resolution and for the same number of training steps.' However, it defers specific hyperparameter settings to 'supplementary material A for further implementation details, hyperparameter settings and training configuration,' indicating these details are not given in the main text. (A hedged sketch of this loss formulation appears at the end of this section.)
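As noted in the Pseudocode row, the paper specifies its Simplex and Duplex Attention only through equations and prose. For orientation, below is a minimal sketch of what a simplex-style bipartite attention update could look like: image features attend to a small set of latent variables, and the aggregated latent information modulates the normalized features through a learned gain and bias. All module names, parameter names, and dimensions here are illustrative assumptions, not the official implementation (which is available in the repository linked above).

import torch
import torch.nn as nn

class SimplexAttention(nn.Module):
    # Hypothetical sketch of a GANsformer-style simplex attention update:
    # image features x attend to a small set of latents y, and the aggregated
    # latent information modulates x through a learned gain and bias.
    # Names and dimensions are illustrative assumptions.
    def __init__(self, dim_x, dim_y, dim_attn):
        super().__init__()
        self.to_q = nn.Linear(dim_x, dim_attn)      # queries from image features
        self.to_k = nn.Linear(dim_y, dim_attn)      # keys from latents
        self.to_v = nn.Linear(dim_y, dim_attn)      # values from latents
        self.to_gain = nn.Linear(dim_attn, dim_x)   # gamma(.)-style mapping
        self.to_bias = nn.Linear(dim_attn, dim_x)   # beta(.)-style mapping
        self.norm = nn.LayerNorm(dim_x, elementwise_affine=False)

    def forward(self, x, y):
        # x: (batch, n_pixels, dim_x), y: (batch, n_latents, dim_y)
        q, k, v = self.to_q(x), self.to_k(y), self.to_v(y)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        a = attn @ v                                # aggregated latent info per pixel
        # modulate normalized features with latent-dependent gain and bias
        return self.to_gain(a) * self.norm(x) + self.to_bias(a)

# Example usage (shapes only):
# layer = SimplexAttention(dim_x=512, dim_y=32, dim_attn=64)
# out = layer(torch.randn(2, 16 * 16, 512), torch.randn(2, 16, 32))  # -> (2, 256, 512)

The paper's duplex variant additionally propagates information in the opposite direction (latents are also updated from the image features); that direction is omitted here for brevity.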
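As context for the Experiment Setup row, the non-saturating logistic loss and R1 regularization that the paper adopts from StyleGAN/StyleGAN2 have a standard, compact form. The sketch below follows that commonly used formulation rather than the paper's supplementary material, so the gamma value, regularization interval, and function names are assumptions.

import torch
import torch.nn.functional as F

def d_logistic_loss(real_logits, fake_logits):
    # non-saturating logistic discriminator loss
    return F.softplus(fake_logits).mean() + F.softplus(-real_logits).mean()

def g_nonsaturating_loss(fake_logits):
    # generator pushes the discriminator's logits on generated images upward
    return F.softplus(-fake_logits).mean()

def r1_penalty(discriminator, real_images, gamma=10.0):
    # R1 regularization: squared gradient norm of D, on real images only
    real_images = real_images.detach().requires_grad_(True)
    logits = discriminator(real_images)
    grads, = torch.autograd.grad(logits.sum(), real_images, create_graph=True)
    return (gamma / 2) * grads.pow(2).sum(dim=[1, 2, 3]).mean()

# "Lazy" regularization applies the R1 term only every reg_interval minibatches,
# scaled by the interval so its expected strength is unchanged (values assumed):
# if step % reg_interval == 0:
#     d_loss = d_loss + r1_penalty(D, reals) * reg_interval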