Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

Authors: Zhisheng Xiao, Karsten Kreis, Arash Vahdat

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Through extensive evaluations, we show that denoising diffusion GANs obtain sample quality and diversity competitive with original diffusion models while being 2000× faster on the CIFAR-10 dataset." |
| Researcher Affiliation | Collaboration | Zhisheng Xiao (The University of Chicago, zxiao@uchicago.edu); Karsten Kreis (NVIDIA, kkreis@nvidia.com); Arash Vahdat (NVIDIA, avahdat@nvidia.com) |
| Pseudocode | No | The paper describes its methods and procedures in text but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | "Project page and code: https://nvlabs.github.io/denoising-diffusion-gan" |
| Open Datasets | Yes | "Through extensive evaluations, we show that denoising diffusion GANs obtain sample quality and diversity competitive with original diffusion models while being 2000× faster on the CIFAR-10 dataset." "We train our model on datasets with larger images, including CelebA-HQ (Karras et al., 2018) and LSUN Church (Yu et al., 2015) at 256×256 px resolution." |
| Dataset Splits | No | The paper evaluates on datasets such as CIFAR-10 but does not specify explicit training/validation/test splits (e.g., percentages or exact counts). |
| Hardware Specification | Yes | "When evaluating sampling time, we use models trained on CIFAR-10 and generate a batch of 100 images on a V100 GPU." "We train our models on CIFAR-10 using 4 V100 GPUs. On CelebA-HQ and LSUN Church we use 8 V100 GPUs." |
| Software Dependencies | Yes | "We use PyTorch 1.9.0 and CUDA 11.0." (See the environment-check sketch below the table.) |
| Experiment Setup | Yes | "For all datasets, we set the number of diffusion steps to be 4." Initial learning rate for discriminator: 10^-4; initial learning rate for generator: 1.6 × 10^-4 (2 × 10^-4 for LSUN Church); Adam β1 = 0.5, β2 = 0.9; EMA decay 0.9999 (or 0.999); batch size 256 (or 32, 64); training iterations 400k (or 750k, 600k). (See the configuration sketch below the table.) |
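
The software stack reported in the paper (PyTorch 1.9.0, CUDA 11.0) and the GPU counts can be sanity-checked with a short snippet like the one below. This is a generic sketch, not part of the authors' released code; only the version numbers and GPU counts in the comments come from the paper.

```python
# Quick environment check against the versions reported in the paper
# ("We use PyTorch 1.9.0 and CUDA 11.0."). The check itself is a sketch,
# not taken from the authors' repository.
import torch

print("PyTorch version:", torch.__version__)       # paper reports 1.9.0
print("CUDA version:", torch.version.cuda)          # paper reports 11.0
print("GPUs visible:", torch.cuda.device_count())   # paper uses 4 V100s (CIFAR-10) or 8 (CelebA-HQ / LSUN Church)
```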
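
For reference, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration dictionary. The key names below are hypothetical (the paper does not define them); only the values are taken from the quoted setup, with the CIFAR-10 defaults shown and the reported alternatives noted in comments.

```python
# Hypothetical configuration dictionary for the CIFAR-10 setup; key names
# are illustrative, values come from the "Experiment Setup" row above.
cifar10_config = {
    "num_diffusion_steps": 4,      # "we set the number of diffusion steps to be 4"
    "lr_discriminator": 1e-4,      # initial discriminator learning rate
    "lr_generator": 1.6e-4,        # 2e-4 is reported for LSUN Church
    "adam_beta1": 0.5,
    "adam_beta2": 0.9,
    "ema_decay": 0.9999,           # 0.999 is reported for some settings
    "batch_size": 256,             # 32 or 64 on the higher-resolution datasets
    "train_iterations": 400_000,   # 750k / 600k for the other datasets
}

# A generator optimizer built from this configuration might look like
# (sketch only, not the authors' code):
#   opt_g = torch.optim.Adam(generator.parameters(),
#                            lr=cifar10_config["lr_generator"],
#                            betas=(cifar10_config["adam_beta1"],
#                                   cifar10_config["adam_beta2"]))
```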