Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation
Authors: Diederik Kingma, Ruiqi Gao
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we explore new monotonic weightings and demonstrate their effectiveness, achieving state-of-the-art FID scores on the high-resolution ImageNet benchmark. (A sketch of such a weighted loss follows the table.) |
| Researcher Affiliation | Industry | Diederik P. Kingma, Google DeepMind, durk@google.com; Ruiqi Gao, Google DeepMind, ruiqig@google.com |
| Pseudocode | No | The paper describes mathematical derivations and experimental setups but does not include pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links to open-source code for the methodology it describes, nor does it explicitly state that its code is being released. |
| Open Datasets | Yes | Samples generated from our VDM++ diffusion models trained on the ImageNet dataset; see Section 5 for details and Appendix M for more samples. |
| Dataset Splits | Yes | We trained the model for 700k iterations and reported the performance of the checkpoint giving the best FID score (checkpoints were saved and evaluated every 20k iterations). |
| Hardware Specification | Yes | We employed 128 TPU-v4 chips with a batch size of 4096 (32 per chip). |
| Software Dependencies | No | The paper mentions that 'The model is optimized by Adam [Kingma and Ba, 2014]' but does not name software libraries or provide version numbers for any dependencies. |
| Experiment Setup | Yes | The model was trained with a learning rate of 1e-4, an exponential moving average (EMA) over 50 million images, and a learning rate warmup over 10 million images, mainly following the configuration of Karras et al. [2022]. (A configuration sketch follows the table.) |
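Taken together, the Hardware Specification and Experiment Setup rows pin down most of the training recipe. The sketch below collects those numbers in one place; the variable names are ours, and reading "50 million images" as an EMA half-life follows the Karras et al. [2022] convention the paper cites, so treat that interpretation as an assumption rather than the authors' exact code.

```python
# Hypothetical training configuration assembled from the table above.
# Names like ema_halflife_images and warmup_images are illustrative,
# not identifiers from the authors' codebase.

batch_size = 4096            # 32 per chip on 128 TPU-v4 chips
learning_rate = 1e-4         # Adam [Kingma and Ba, 2014]
total_iterations = 700_000
eval_every = 20_000          # checkpoints saved and evaluated every 20k iterations

ema_halflife_images = 50_000_000   # "EMA of 50 million images" (assumed half-life)
warmup_images = 10_000_000         # "learning rate warmup of 10 million images"

# Per-step EMA decay, if the 50M figure is a half-life measured in images:
ema_decay = 0.5 ** (batch_size / ema_halflife_images)   # ~0.999943

# Warmup expressed in optimizer steps:
warmup_steps = warmup_images // batch_size              # ~2441 steps

def lr_at(step: int) -> float:
    """Linearly warm up to the base learning rate, then hold it constant."""
    return learning_rate * min(1.0, step / max(1, warmup_steps))
```

Stating the EMA and warmup budgets in images rather than steps keeps the schedule invariant to batch size, which is presumably why the paper reports them that way.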
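The "monotonic weightings" quoted in the Research Type row refer to the paper's central object: a noise-prediction loss reweighted by a monotonic function w(λ) of the log-SNR λ, which the paper relates to the ELBO under noise-based data augmentation. Below is a minimal PyTorch-style sketch assuming an ε-prediction parameterization; the function names and the particular sigmoid shift are illustrative choices of ours, not the authors' implementation.

```python
import torch

def weighted_eps_loss(eps_pred: torch.Tensor,
                      eps: torch.Tensor,
                      lam: torch.Tensor,
                      weight_fn) -> torch.Tensor:
    """Weighted noise-prediction loss: E[ w(lambda) * ||eps_pred - eps||^2 ].

    A sketch of a weighted diffusion objective; `weight_fn` should be a
    monotonic function of the log-SNR `lam` for the ELBO interpretation
    discussed in the paper to apply. w(lam) == 1 recovers the standard
    epsilon-prediction objective.
    """
    per_example = ((eps_pred - eps) ** 2).flatten(start_dim=1).mean(dim=1)
    return (weight_fn(lam) * per_example).mean()

# One simple monotonic (non-increasing in lambda) weighting; the shift of
# 2.0 is an arbitrary illustrative value, not the paper's tuned setting.
def sigmoid_weight(lam: torch.Tensor) -> torch.Tensor:
    return torch.sigmoid(2.0 - lam)
```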