Elucidating the Exposure Bias in Diffusion Models
Authors: Mang Ning, Mingxiao Li, Jianlin Su, Albert Ali Salah, Itir Onal Ertugrul
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various diffusion frameworks (ADM, DDIM, EDM, LDM, DiT, PFGM++) verify the effectiveness of our method. Remarkably, our ADM-ES, as a state-of-the-art stochastic sampler, obtains 2.17 FID on CIFAR-10 under 100-step unconditional generation. |
| Researcher Affiliation | Collaboration | Mang Ning (Utrecht University, m.ning@uu.nl); Mingxiao Li (KU Leuven, mingxiao.li@cs.kuleuven.be); Jianlin Su (Moonshot AI Ltd., bojone@spaces.ac.cn); Albert Ali Salah (Utrecht University, a.a.salah@uu.nl); Itir Onal Ertugrul (Utrecht University, i.onalertugrul@uu.nl) |
| Pseudocode | Yes | Algorithm 1: Variance error under single-step sampling... Algorithm 2: Variance error under multi-step sampling... Algorithm 3: Measurement of Exposure Bias δt |
| Open Source Code | Yes | The code is at https://github.com/forever208/ADM-ES |
| Open Datasets | Yes | Experiments on various diffusion frameworks (ADM, DDIM, EDM, LDM, DiT, PFGM++) verify the effectiveness of our method. Remarkably, our ADM-ES, as a state-of-the-art stochastic sampler, obtains 2.17 FID on CIFAR-10 under 100-step unconditional generation... CIFAR-10 (Krizhevsky et al., 2009), LSUN tower (Yu et al., 2015) and FFHQ (Karras et al., 2019)... CelebA 64×64 (Liu et al., 2015)... ImageNet 256×256 |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or sample counts) used for training the models or baselines evaluated. |
| Hardware Specification | No | The paper does not mention any specific hardware specifications (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We present the complete parameters k, b used in all experiments and the details on the search of k, b in Appendix A.10. Overall, searching for the optimal uniform λ(t) is effortless and takes 6 to 10 trials. In Appendix A.11, we also demonstrate that the FID gain can be achieved within a wide range of λ(t), which indicates the insensitivity of λ(t)... Search for the optimal uniform schedule λ(t) = b in a coarse-to-fine manner: use stride 0.001, 0.0005, 0.0001 progressively. |
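The quoted setup describes a coarse-to-fine 1-D search for the optimal uniform schedule λ(t) = b, progressively shrinking the stride from 0.001 to 0.0005 to 0.0001 and needing only 6 to 10 FID evaluations. A minimal sketch of that search procedure, assuming a hypothetical `evaluate_fid` callable (b → FID score) standing in for an actual sampling-plus-FID pipeline:

```python
def coarse_to_fine_search(evaluate_fid, center=1.0,
                          strides=(0.001, 0.0005, 0.0001), span=3):
    """Coarse-to-fine search for a uniform schedule lambda(t) = b.

    At each stride, probe `span` candidates on each side of the current
    best b, keep the candidate with the lowest FID, then refine with the
    next (smaller) stride. `evaluate_fid` and `center` are assumptions
    for illustration, not values from the paper.
    """
    best_b, best_fid = center, evaluate_fid(center)
    for stride in strides:
        anchor = best_b  # refine around the best point found so far
        for k in range(-span, span + 1):
            if k == 0:
                continue  # already evaluated
            b = anchor + k * stride
            fid = evaluate_fid(b)
            if fid < best_fid:
                best_b, best_fid = b, fid
    return best_b, best_fid


if __name__ == "__main__":
    # Toy stand-in objective: a quadratic with its minimum at b = 1.0023,
    # mimicking an FID curve that is smooth in b near its optimum.
    toy_fid = lambda b: (b - 1.0023) ** 2
    b, fid = coarse_to_fine_search(toy_fid)
    print(b, fid)
```

With the toy objective, the search converges to b ≈ 1.0023 in three refinement passes, mirroring the stride schedule 0.001 → 0.0005 → 0.0001 quoted above.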