Neural Diffusion Models

Authors: Grigory Bartosh, Dmitry Vetrov, Christian A. Naesseth

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present Neural Diffusion Models (NDMs), a generalization of conventional diffusion models that enables defining and learning time-dependent non-linear transformations of data. We show how to optimise NDMs using a variational bound in a simulation-free setting. Moreover, we derive a time-continuous formulation of NDMs, which allows fast and reliable inference using off-the-shelf numerical ODE and SDE solvers. Finally, we demonstrate the utility of NDMs through experiments on many image generation benchmarks, including MNIST, CIFAR-10, downsampled versions of ImageNet and CelebA-HQ. NDMs outperform conventional diffusion models in terms of likelihood, achieving state-of-the-art results on ImageNet and CelebA-HQ, and produce high-quality samples.
Researcher Affiliation | Academia | ¹University of Amsterdam, ²Constructor University, Bremen. Correspondence to: Grigory Bartosh <g.bartosh@uva.nl>, Dmitry Vetrov <dvetrov@constructor.university>, Christian A. Naesseth <c.a.naesseth@uva.nl>.
Pseudocode | Yes | Algorithm 1 (Learning NDM) and Algorithm 2 (Sampling from NDM); see the illustrative training-step sketch after this table.
Open Source Code | No | The paper does not provide a repository link, make an explicit code-release statement, or indicate that code is included in the supplementary materials.
Open Datasets | Yes | We demonstrate NDMs with learnable transformations on the MNIST (Deng, 2012), CIFAR-10 (Krizhevsky et al., 2009), downsampled ImageNet (Deng et al., 2009; van den Oord et al., 2016) and CelebA-HQ-256 (Karras et al., 2017) datasets.
Dataset Splits | No | The paper mentions using test data but does not explicitly describe validation splits; it states only that NLL and NELBO are computed on test data.
Hardware Specification | Yes | The training was performed using Tesla V100 GPUs.
Software Dependencies | No | The paper mentions using the RK45 solver and a U-Net architecture but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | The hyper-parameters are presented in Table 6. In all experiments we use the same neural network architectures to parameterize both the generative process and the transformations Fφ. To facilitate training, we employed a polynomial decay learning rate schedule, which includes a warm-up phase for a specified number of training steps. During the warm-up phase, the learning rate is linearly increased from 10⁻⁸ to the peak learning rate; once the peak learning rate is reached, it is linearly decayed back to 10⁻⁸ until the final training step. A sketch of this schedule follows below the table.
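
The pseudocode row above refers to Algorithm 1 (Learning NDM). As a rough illustration of the simulation-free structure that algorithm relies on (noisy latents are formed in closed form from the transformed data, so no forward-process simulation is needed), here is a minimal PyTorch sketch. The `transform` callable stands in for the learned transformation Fφ and `denoiser` for the generative network; the cosine noise schedule and the simplified surrogate loss are illustrative assumptions, not the paper's full variational bound.

    import torch

    def ndm_style_training_step(x, transform, denoiser, optimizer):
        """Illustrative simulation-free training step (not the paper's exact objective).

        x:         batch of images, shape (B, C, H, W)
        transform: callable (x, t) -> transformed data, standing in for F_phi
        denoiser:  callable (z_t, t) -> prediction of the clean data
        """
        b = x.shape[0]
        t = torch.rand(b, device=x.device)                        # uniform timesteps in [0, 1]
        alpha = torch.cos(0.5 * torch.pi * t).view(-1, 1, 1, 1)   # assumed cosine schedule
        sigma = torch.sin(0.5 * torch.pi * t).view(-1, 1, 1, 1)

        # Closed-form marginal of the latent: no trajectory simulation required.
        eps = torch.randn_like(x)
        z_t = alpha * transform(x, t) + sigma * eps

        x_hat = denoiser(z_t, t)

        # Simplified surrogate loss: match the transformed prediction to the
        # transformed data. The paper instead optimises a full variational bound.
        loss = ((transform(x_hat, t) - transform(x, t)) ** 2).mean()

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

If the optimizer holds the parameters of both networks, the transformation and the denoiser are trained jointly in this sketch.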
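
The learning-rate schedule described in the experiment-setup row (linear warm-up from 10⁻⁸ to a peak value, then decay back to 10⁻⁸ by the final step) can be written in a few lines. The function and argument names below are illustrative, the decay is taken to be linear (degree-1 polynomial), and the actual peak learning rate and step counts are the hyper-parameters given in the paper's Table 6.

    def learning_rate(step, total_steps, warmup_steps, peak_lr, floor_lr=1e-8):
        """Linear warm-up to peak_lr, then linear decay back to floor_lr (illustrative)."""
        if step < warmup_steps:
            # Warm-up: increase linearly from floor_lr to peak_lr.
            frac = step / max(warmup_steps, 1)
            return floor_lr + frac * (peak_lr - floor_lr)
        # Decay: decrease linearly from peak_lr back to floor_lr by the final step.
        frac = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
        return peak_lr + frac * (floor_lr - peak_lr)

    # Example with placeholder values (not the settings from Table 6):
    # lr = learning_rate(step=20_000, total_steps=500_000, warmup_steps=10_000, peak_lr=2e-4)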