Directly Denoising Diffusion Models

Authors: Dan Zhang, Jingjing Wang, Feng Luo

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we demonstrate the effectiveness of DDDMs across various image datasets, including CIFAR-10 (Krizhevsky et al., 2009) and ImageNet 64x64 (Deng et al., 2009), and observe results comparable to current state-of-the-art methods. Our model achieves FID scores of 2.57 and 2.33 on CIFAR-10 with one-step and two-step sampling, respectively. By extending sampling to 1000 steps, we further reduce the FID score to 1.79.
Researcher Affiliation | Academia | School of Computing, Clemson University, USA.
Pseudocode | Yes | Algorithm 1 (Training) and Algorithm 2 (Sampling). (See the training/sampling sketch after the table.)
Open Source Code | Yes | Our code is available at https://github.com/TheLuoFengLab/DDDM.
Open Datasets | Yes | To evaluate our method for image generation, we train several DDDMs on CIFAR-10 (Krizhevsky et al., 2009) and ImageNet 64x64 (Deng et al., 2009) and benchmark their performance against competing methods in the literature.
Dataset Splits | No | The paper states that FID is computed between 50K generated samples and the full training set, but it does not specify explicit train/validation/test splits for model training and evaluation. (See the FID sketch after the table.)
Hardware Specification | Yes | All models are trained on 8 Nvidia A100 GPUs.
Software Dependencies | No | The paper mentions using Adam for its experiments but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | For CIFAR-10, we set T = 1000 for the baseline model and train it for 1000 epochs with a constant learning rate of 0.0002 and a batch size of 1024. ... For ImageNet 64x64, ... we train the model for 520 epochs with a constant learning rate of 0.0001 and a batch size of 1024. We use an exponential moving average (EMA) of the weights during training with a decay factor of 0.9999 for all experiments. (See the configuration/EMA sketch after the table.)
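
The Pseudocode row refers to the paper's Algorithm 1 (Training) and Algorithm 2 (Sampling). The sketch below is not a reproduction of those algorithms; it is a generic, hedged illustration of what a direct-x0-prediction training step and a one-step sampler can look like, using a placeholder `TinyUNet` network, an assumed linear noise schedule, and a plain MSE loss in place of whatever objective the paper actually uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000  # number of diffusion timesteps (matches the paper's CIFAR-10 setting)

# Hypothetical stand-in for the DDDM network; the real architecture differs.
class TinyUNet(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x_t, t):
        # Broadcast the normalized timestep as an extra input channel.
        t_map = (t.float() / T).view(-1, 1, 1, 1).expand(-1, 1, *x_t.shape[2:])
        return self.net(torch.cat([x_t, t_map], dim=1))

# Assumed linear beta schedule; the paper's schedule may differ.
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def training_step(model, x0):
    """One training step: corrupt x0, predict x0 directly, MSE loss."""
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    a = alphas_bar[t].view(-1, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise
    x0_pred = model(x_t, t)
    return F.mse_loss(x0_pred, x0)

@torch.no_grad()
def sample_one_step(model, shape):
    """One-step sampling: map pure noise straight to an x0 estimate."""
    x_T = torch.randn(shape)
    t = torch.full((shape[0],), T - 1, dtype=torch.long)
    return model(x_T, t)
```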
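
The Dataset Splits row notes that FID is computed between 50K generated samples and the whole training set. A minimal way to reproduce that protocol, assuming the generated images have been written to disk and using the third-party clean-fid package (the paper does not say which FID implementation it uses; the folder names below are hypothetical):

```python
# pip install clean-fid
from cleanfid import fid

# Assumed layout: 50K generated PNGs in one folder and the CIFAR-10
# training images exported to another.
score = fid.compute_fid("samples_50k/", "cifar10_train_images/")
print(f"FID (generated vs. training set): {score:.2f}")
```

clean-fid also ships precomputed reference statistics for common datasets such as CIFAR-10, which can be used instead of exporting the training images to a folder.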
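
The Experiment Setup row lists the reported hyperparameters (T = 1000, 1000 epochs, learning rate 0.0002, batch size 1024, EMA decay 0.9999 for CIFAR-10). The sketch below only wires those numbers to an Adam optimizer and an EMA copy of the weights; the network is a placeholder and the training loop itself is omitted, so this is not the paper's code.

```python
import copy
import torch

# Reported CIFAR-10 settings (ImageNet 64x64 uses 520 epochs and lr 0.0001).
EPOCHS = 1000
LR = 2e-4
BATCH_SIZE = 1024
EMA_DECAY = 0.9999

model = torch.nn.Conv2d(3, 3, 3, padding=1)   # stand-in for the DDDM network
ema_model = copy.deepcopy(model).eval()        # EMA weights used for sampling/evaluation
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

@torch.no_grad()
def ema_update(ema, online, decay=EMA_DECAY):
    # ema <- decay * ema + (1 - decay) * online, parameter by parameter.
    for p_ema, p in zip(ema.parameters(), online.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

# Inside the (omitted) training loop, after each optimizer.step():
#     ema_update(ema_model, model)
```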