Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation

Authors: Diederik P. Kingma, Ruiqi Gao

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In experiments, we explore new monotonic weightings and demonstrate their effectiveness, achieving state-of-the-art FID scores on the high-resolution ImageNet benchmark." (A hedged sketch of such a weighted loss follows the table.) |
| Researcher Affiliation | Industry | Diederik P. Kingma (Google DeepMind, durk@google.com); Ruiqi Gao (Google DeepMind, ruiqig@google.com) |
| Pseudocode | No | The paper presents mathematical derivations and experimental setups but includes no pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper neither links to open-source code for its methodology nor states that the code is being released. |
| Open Datasets | Yes | "Samples generated from our VDM++ diffusion models trained on the ImageNet dataset; see Section 5 for details and Appendix M for more samples." |
| Dataset Splits | Yes | "We trained the model for 700k iterations and reported the performance of the checkpoint giving the best FID score (checkpoints were saved and evaluated every 20k iterations)." |
| Hardware Specification | Yes | "We employed 128 TPU-v4 chips with a batch size of 4096 (32 per chip)." |
| Software Dependencies | No | The paper states that "the model is optimized by Adam [Kingma and Ba, 2014]" but provides no version numbers for Adam or any other software dependency. |
| Experiment Setup | Yes | "The model was trained with learning rate 1e-4, exponential moving average of 50 million images and learning rate warmup of 10 million images, which mainly follows the configuration of Karras et al. [2022]." (A hedged sketch of this schedule follows the table.) |
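
The weighted diffusion objective referenced in the Research Type row can be made concrete in a few lines. The JAX snippet below is a minimal sketch, not the paper's implementation: the linear log-SNR schedule, the `bias` knob, the sigmoid weighting `w(lam) = sigmoid(bias - lam)`, and the placeholder `eps_model` are all illustrative assumptions; the paper's actual weightings, noise schedules, and loss scaling differ in detail.

```python
import jax
import jax.numpy as jnp

def log_snr(t, lam_min=-10.0, lam_max=10.0):
    # Assumed linear log-SNR schedule in t; the paper's schedules differ.
    return lam_max + t * (lam_min - lam_max)

def monotonic_weight(lam, bias=2.0):
    # An illustrative monotonic weighting w(lambda); `bias` is a made-up knob.
    return jax.nn.sigmoid(bias - lam)

def weighted_diffusion_loss(eps_model, params, x, key):
    # x: data batch of shape (B, D); eps_model(params, z, lam) predicts the noise.
    t_key, eps_key = jax.random.split(key)
    t = jax.random.uniform(t_key, (x.shape[0],))
    lam = log_snr(t)
    # Variance-preserving forward process: alpha^2 = sigmoid(lam), sigma^2 = sigmoid(-lam).
    alpha = jnp.sqrt(jax.nn.sigmoid(lam))
    sigma = jnp.sqrt(jax.nn.sigmoid(-lam))
    eps = jax.random.normal(eps_key, x.shape)
    z = alpha[:, None] * x + sigma[:, None] * eps
    eps_hat = eps_model(params, z, lam)
    per_example = jnp.sum((eps_hat - eps) ** 2, axis=-1)
    # A monotonic w(lambda) is what ties this objective to the ELBO in the paper.
    return jnp.mean(monotonic_weight(lam) * per_example)

# Toy usage with a trivial "model" that always predicts zero noise:
loss = weighted_diffusion_loss(lambda p, z, lam: jnp.zeros_like(z),
                               None, jnp.ones((8, 16)), jax.random.PRNGKey(0))
```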
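
The Experiment Setup row can likewise be sketched. Below is a minimal optax configuration under stated assumptions: the warmup and EMA lengths, reported in images, are converted to steps using the reported batch size of 4096; the warmup is assumed linear; and "exponential moving average of 50 million images" is read as an EMA half-life in the style of Karras et al. [2022]. None of these readings are confirmed by the paper.

```python
import jax
import optax

batch_size = 4096            # 128 TPU-v4 chips x 32 per chip (from the paper)
base_lr = 1e-4               # reported learning rate
warmup_images = 10_000_000   # "learning rate warmup of 10 million images"
ema_images = 50_000_000      # "exponential moving average of 50 million images"

# Assumption: warmup is linear in optimizer steps (images / batch size, ~2441 steps).
warmup_steps = warmup_images // batch_size
schedule = optax.linear_schedule(init_value=0.0, end_value=base_lr,
                                 transition_steps=warmup_steps)
optimizer = optax.adam(learning_rate=schedule)

# Assumption: the EMA length is a half-life measured in images (EDM-style),
# giving a per-step decay of 0.5 ** (batch_size / ema_images), ~0.999943.
ema_decay = 0.5 ** (batch_size / ema_images)

def update_ema(ema_params, params):
    # Standard parameter EMA, applied once per optimizer step.
    return jax.tree_util.tree_map(
        lambda e, p: ema_decay * e + (1.0 - ema_decay) * p, ema_params, params)
```

Under the paper's reported protocol, training would then run for 700k iterations, with a checkpoint saved and FID-scored every 20k iterations and the best-scoring checkpoint reported.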