D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation

Authors: Abhishek Sinha, Jiaming Song, Chenlin Meng, Stefano Ermon

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate and compare D2C with several state-of-the-art generative models over 6 datasets. On unconditional generation, D2C outperforms state-of-the-art VAEs and is competitive with diffusion models under similar computational budgets. On conditional generation with 100 labeled examples, D2C significantly outperforms state-of-the-art VAE [91] and diffusion models [84]. We report sample quality results in Tables 2 and 3."
Researcher Affiliation | Academia | Abhishek Sinha, Department of Computer Science, Stanford University, a7b23@stanford.edu; Jiaming Song, Department of Computer Science, Stanford University, tsong@cs.stanford.edu; Chenlin Meng, Department of Computer Science, Stanford University, chenlin@cs.stanford.edu; Stefano Ermon, Department of Computer Science, Stanford University, ermon@cs.stanford.edu
Pseudocode | Yes | "Algorithm 1: Conditional generation with D2C" (a hedged sketch of this procedure follows the table)
Open Source Code | Yes | "We release our code at https://github.com/jiamings/d2c."
Open Datasets | Yes | "We examine the conditional and unconditional generation qualities of D2C over CIFAR-10 [53], CIFAR-100 [53], fMoW [21], CelebA-64 [58], CelebA-HQ-256 [48], and FFHQ-256 [49]."
Dataset Splits | No | The paper mentions "training images" and a "test set" but provides no train/validation/test percentages or counts, and describes no cross-validation setup.
Hardware Specification | Yes | "On the same Nvidia 1080Ti GPU, it takes 0.013 seconds to obtain the latent code in D2C, while the same takes 8 seconds [106] for StyleGAN2 (615× slower)."
Software Dependencies | No | The paper names software components such as the "NVAE autoencoder structure", "U-Net diffusion model", and "MoCo-v2 contrastive representation learning method", but gives no version numbers for them or for broader frameworks such as PyTorch or TensorFlow.
Experiment Setup | Yes | "For the contrastive weight λ in Equation (4), we consider the value of λ = 10^4 based on the relative scale between the L_C and L_D2; we find that the results are relatively insensitive to λ. We use 100 diffusion steps for DDIM and D2C unless mentioned otherwise, as running with longer steps is not computationally economical despite tiny gains in FID [84]. We include additional training details, such as architectures, optimizers and learning rates, in Appendix C." (a hedged sketch of this objective also follows the table)
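
The "Pseudocode" row points to Algorithm 1, the paper's few-shot conditional generation procedure: fit a small classifier p(y|z) on the latents of the ~100 labeled examples, then rejection-sample latents from the diffusion prior and decode the accepted ones. The PyTorch sketch below is a minimal illustration of that flow, not the released code; encode, decode, sample_latent_prior, and classifier are hypothetical stand-ins (see https://github.com/jiamings/d2c for the authors' implementation).

    # Minimal sketch of D2C-style few-shot conditional generation (cf. Algorithm 1).
    # encode / decode / sample_latent_prior / classifier are assumed interfaces.
    import torch
    import torch.nn.functional as F

    def train_latent_classifier(encode, classifier, xs, ys, steps=200, lr=1e-3):
        """Fit p(y|z) on the latents of the ~100 labeled examples."""
        with torch.no_grad():
            zs = encode(xs)                          # one amortized-inference pass
        opt = torch.optim.Adam(classifier.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            F.cross_entropy(classifier(zs), ys).backward()
            opt.step()
        return classifier

    @torch.no_grad()
    def conditional_sample(sample_latent_prior, classifier, decode, label, n=16):
        """Rejection sampling: keep prior latents the classifier assigns to `label`."""
        accepted = []
        while len(accepted) < n:                     # may loop long if p(label) is tiny
            z = sample_latent_prior()                # e.g. 100 DDIM steps from noise
            p = classifier(z).softmax(dim=-1)[:, label]
            keep = torch.rand_like(p) < p            # accept z with probability p(y|z)
            accepted.extend(z[keep])
        return decode(torch.stack(accepted[:n]))     # decode accepted latents to images

Because both the classifier and the sampler operate purely in latent space, only the final decode touches pixel space, which is what makes the 100-label setting cheap.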
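
The "Experiment Setup" row quotes how the diffusion-decoding loss and the contrastive loss are combined, L_D2C = L_D2 + λ·L_C (Equation (4) in the paper). Below is a minimal sketch of that weighting under stated assumptions: reconstruction_loss, diffusion_loss, and contrastive_loss are hypothetical stand-ins for the paper's NVAE reconstruction term, latent diffusion term, and MoCo-v2-style contrastive term, and the λ value simply mirrors the excerpt above.

    # Sketch of the D2C objective from Equation (4): L_D2C = L_D2 + λ * L_C.
    # All loss callables are assumed interfaces, not the authors' code.
    LAMBDA_C = 1e4   # contrastive weight λ; the paper reports results are relatively insensitive to it

    def d2c_loss(x, encode, reconstruction_loss, diffusion_loss, contrastive_loss):
        z = encode(x)                                          # latent code from the encoder
        l_d2 = reconstruction_loss(x, z) + diffusion_loss(z)   # L_D2: decode + latent prior terms
        l_c = contrastive_loss(z)                              # L_C: MoCo-v2-style term
        return l_d2 + LAMBDA_C * l_c                           # L_D2C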