Elucidating the Exposure Bias in Diffusion Models

Authors: Mang Ning, Mingxiao Li, Jianlin Su, Albert Ali Salah, Itir Onal Ertugrul

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various diffusion frameworks (ADM, DDIM, EDM, LDM, DiT, PFGM++) verify the effectiveness of our method. Remarkably, our ADM-ES, as a state-of-the-art stochastic sampler, obtains 2.17 FID on CIFAR-10 under 100-step unconditional generation. [see the epsilon-scaling sketch after this table]
Researcher Affiliation | Collaboration | Mang Ning (Utrecht University, m.ning@uu.nl); Mingxiao Li (KU Leuven, mingxiao.li@cs.kuleuven.be); Jianlin Su (Moonshot AI Ltd., bojone@spaces.ac.cn); Albert Ali Salah (Utrecht University, a.a.salah@uu.nl); Itir Onal Ertugrul (Utrecht University, i.onalertugrul@uu.nl)
Pseudocode | Yes | Algorithm 1: Variance error under single-step sampling... Algorithm 2: Variance error under multi-step sampling... Algorithm 3: Measurement of Exposure Bias δt
Open Source Code | Yes | The code is at https://github.com/forever208/ADM-ES
Open Datasets | Yes | Experiments on various diffusion frameworks (ADM, DDIM, EDM, LDM, DiT, PFGM++) verify the effectiveness of our method. Remarkably, our ADM-ES, as a state-of-the-art stochastic sampler, obtains 2.17 FID on CIFAR-10 under 100-step unconditional generation... CIFAR-10 (Krizhevsky et al., 2009), LSUN tower (Yu et al., 2015) and FFHQ (Karras et al., 2019)... CelebA 64×64 datasets (Liu et al., 2015)... ImageNet 256×256
Dataset Splits | No | The paper does not explicitly provide the training/validation/test splits (e.g., percentages or sample counts) used for the models or baselines evaluated.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We present the complete parameters k, b used in all experiments and the details on the search of k, b in Appendix A.10. Overall, searching for the optimal uniform λ(t) is effortless and takes 6 to 10 trials. In Appendix A.11, we also demonstrate that the FID gain can be achieved within a wide range of λ(t), which indicates the insensitivity of λ(t)... Search for the optimal uniform schedule λ(t) = b in a coarse-to-fine manner: use strides 0.001, 0.0005, 0.0001 progressively. [a coarse-to-fine search sketch follows this table]
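
The table above refers to ADM-ES's epsilon-scaling sampler and its λ(t) schedule without showing code. Below is a minimal PyTorch sketch of one DDPM-style ancestral step in which the predicted noise is divided by λ_t before the posterior mean is formed; the function name, tensor layout, and the simple σ_t choice are illustrative assumptions and do not mirror the ADM-ES repository.

```python
import torch

def ancestral_step_with_eps_scaling(x_t, t, eps_pred, alphas, alphas_cumprod, lambda_t):
    """One DDPM-style reverse step with epsilon scaling (illustrative sketch).

    The network's noise prediction eps_pred is divided by lambda_t (slightly
    greater than 1) before computing the posterior mean, which is the core of
    the epsilon-scaling idea; everything else is a plain ancestral update.
    Names and the variance choice are assumptions, not the ADM-ES code.
    """
    eps_scaled = eps_pred / lambda_t                        # epsilon scaling
    alpha_t = alphas[t]                                     # alpha_t = 1 - beta_t
    alpha_bar_t = alphas_cumprod[t]                         # cumulative product of alphas
    coef = (1.0 - alpha_t) / torch.sqrt(1.0 - alpha_bar_t)
    mean = (x_t - coef * eps_scaled) / torch.sqrt(alpha_t)  # posterior mean
    if t > 0:
        sigma_t = torch.sqrt(1.0 - alpha_t)                 # simple fixed-variance choice
        return mean + sigma_t * torch.randn_like(x_t)
    return mean
```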
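
The Experiment Setup row describes a coarse-to-fine search over the uniform schedule λ(t) = b with strides 0.001, 0.0005, 0.0001. The sketch below shows one way such a search could be organized; `evaluate_fid` is a hypothetical callback that samples with a given b and returns the FID, and the starting value and stopping rule are assumptions rather than details from the paper.

```python
def coarse_to_fine_lambda_search(evaluate_fid, b_start=1.0,
                                 strides=(0.001, 0.0005, 0.0001)):
    """Hill-climb the uniform epsilon-scaling factor b = lambda(t), coarse to fine.

    evaluate_fid(b) is assumed to generate samples with epsilon scaling b and
    return their FID; only the progressively refined strides come from the
    paper, the rest of this routine is an illustrative assumption.
    """
    best_b, best_fid = b_start, evaluate_fid(b_start)
    for stride in strides:                        # refine the step size progressively
        improved = True
        while improved:                           # move while FID keeps improving
            improved = False
            for candidate in (best_b + stride, best_b - stride):
                fid = evaluate_fid(candidate)
                if fid < best_fid:
                    best_b, best_fid = candidate, fid
                    improved = True
    return best_b, best_fid
```

With the step size refined in this way, a handful of evaluations per stride is consistent with the 6-to-10 trials reported in the row above.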