Relay Diffusion: Unifying diffusion process across resolutions for image synthesis

Authors: Jiayan Teng, Wendi Zheng, Ming Ding, Wenyi Hong, Jianqiao Wangni, Zhuoyi Yang, Jie Tang

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We evaluate the effectiveness of RDM on unconditional CelebA-HQ 256×256 and class-conditional ImageNet 256×256 datasets. RDM achieves state-of-the-art FID on CelebA-HQ and sFID on ImageNet." |
| Researcher Affiliation | Collaboration | 1. Tsinghua University 2. Zhipu AI |
| Pseudocode | Yes | "Algorithm 1: the RDM second-order stochastic sampler" |
| Open Source Code | Yes | "All the codes and checkpoints are open-sourced at https://github.com/THUDM/RelayDiffusion." |
| Open Datasets | Yes | "We use CelebA-HQ and ImageNet in our experiments. CelebA-HQ (Karras et al., 2018) is a high-quality subset of CelebA (Liu et al., 2015)... ImageNet (Deng et al., 2009) contains 1,281,167 images spanning 1000 classes..." |
| Dataset Splits | No | The paper mentions training on CelebA-HQ and ImageNet but does not explicitly state the training/validation/test splits used in its experiments. It refers to standard datasets without detailing the splits. |
| Hardware Specification | Yes | "On ImageNet, the first-stage model was trained on 32 V100 GPUs for 13 days following EDM (Karras et al., 2022), and the second-stage model (64→256) was trained on 64 40GB A100 GPUs for 12.5 days. On CelebA-HQ, we trained the first-stage model on 32 40GB A100 GPUs for 16 hours and the second-stage model (64→256) on 32 40GB A100 GPUs for 25.5 hours." |
| Software Dependencies | No | The paper states that it follows the EDM formulation and implementation and uses released EDM checkpoints, but it does not specify concrete software dependencies with version numbers (e.g., PyTorch or CUDA versions). |
| Experiment Setup | Yes | "Hyperparameters we use for the training of RDM are presented in Table 4." |
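The pseudocode row above refers to a second-order stochastic sampler in the EDM style. The paper's Algorithm 1 is not reproduced here; as a rough illustration of what such a sampler looks like, the following is a minimal NumPy sketch of a generic EDM-style Heun sampler (Karras et al., 2022), not RDM's actual algorithm. The function names (`heun_sampler`, `denoise`) and the `s_churn` noise-injection parameter are illustrative assumptions.

```python
import numpy as np

def heun_sampler(denoise, sigmas, x, rng=None, s_churn=0.0):
    """Generic EDM-style second-order (Heun) sampler sketch.

    denoise(x, sigma): model estimate of the clean sample at noise level sigma.
    sigmas: decreasing noise levels, ending at 0.0.
    s_churn: optional stochasticity (0.0 gives the deterministic ODE solver).
    """
    rng = rng or np.random.default_rng(0)
    n = len(sigmas) - 1
    for i in range(n):
        sigma, sigma_next = sigmas[i], sigmas[i + 1]
        # Optional "churn": briefly raise the noise level, then add matching noise.
        gamma = min(s_churn / n, np.sqrt(2) - 1) if s_churn > 0 else 0.0
        sigma_hat = sigma * (1 + gamma)
        if gamma > 0:
            x = x + np.sqrt(sigma_hat**2 - sigma**2) * rng.standard_normal(x.shape)
        # Euler step along d = (x - D(x)) / sigma.
        d = (x - denoise(x, sigma_hat)) / sigma_hat
        x_next = x + (sigma_next - sigma_hat) * d
        # Second-order (Heun) correction, skipped at the final step sigma_next == 0.
        if sigma_next > 0:
            d_next = (x_next - denoise(x_next, sigma_next)) / sigma_next
            x_next = x + (sigma_next - sigma_hat) * 0.5 * (d + d_next)
        x = x_next
    return x
```

For a toy one-point dataset, the ideal denoiser always returns that point, and the sampler converges to it exactly; RDM's version additionally starts the sampler from the upsampled low-resolution output rather than from pure noise.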