Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion
Authors: Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | CTM achieves new state-of-the-art FIDs for single-step diffusion model sampling on CIFAR-10 (FID 1.73) and ImageNet at 64×64 resolution (FID 1.92). CTM also enables a new family of sampling schemes, both deterministic and stochastic, involving long jumps along the ODE solution trajectories (a sketch of this γ-sampling appears after the table). It consistently improves sample quality as computational budgets increase, avoiding the degradation seen in CM. Furthermore, unlike CM, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods from the diffusion community. This access also enables the computation of likelihood. |
| Researcher Affiliation | Collaboration | Dongjun Kim & Chieh-Hsin Lai, Sony AI, Tokyo, Japan (dongjun@stanford.edu, chieh-hsin.lai@sony.com); Wei-Hsiang Liao, Naoki Murata, Yuhta Takida & Toshimitsu Uesaka, Sony AI, Tokyo, Japan; Yutong He, Carnegie Mellon University, PA, USA; Yuki Mitsufuji, Sony AI & Sony Group Corporation, Tokyo, Japan; Stefano Ermon, Stanford University, CA, USA |
| Pseudocode | Yes | Algorithm 2 (CTM's γ-sampling), Algorithm 3 (Loss-based Trajectory Optimization), Algorithm 4 (CTM Training) |
| Open Source Code | Yes | The code is available at https://github.com/sony/ctm. |
| Open Datasets | Yes | We evaluate CTM on CIFAR-10 and ImageNet 64×64, using the pre-trained diffusion checkpoints from EDM (CIFAR-10) and CM (ImageNet) as the teacher models. |
| Dataset Splits | Yes | We evaluate CTM on CIFAR-10 and ImageNet 64×64. Table 2 includes "Validation Data" metrics, indicating the use of a validation set. The use of standard datasets like CIFAR-10 and ImageNet implies well-defined standard splits are used. |
| Hardware Specification | Yes | We use 4 V100 (16G) GPUs for CIFAR-10 experiments and 8 A100 (40G) GPUs for ImageNet experiments. |
| Software Dependencies | No | The paper mentions the "official PyTorch code of CM" and specific model implementations like "EDM's DDPM++ implementation", but does not provide version numbers for software dependencies such as PyTorch or CUDA. |
| Experiment Setup | Yes | Table 4 ("Experimental details on hyperparameters") lists specific values for the learning rate, the student's stop-grad EMA parameter µ, N, the ODE solver, the maximum number of ODE steps, the EMA decay rate, the training iterations, mixed precision (FP16), the batch size, and the number of GPUs. The text also gives specific values such as σ_min = 0.002, σ_max = 80, ρ = 7, and σ_data = 0.5 (a sketch of the resulting noise schedule appears below). |
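
Since γ-sampling (Algorithm 2) is referenced twice above, a minimal PyTorch sketch may help. It assumes a hypothetical trained network `ctm_model(x, t, s)` that jumps a sample from noise level t to level s along the PF ODE, as CTM's trajectory function does; this is a sketch of the scheme's structure, not the paper's implementation.

```python
import torch

def gamma_sampling(ctm_model, x_init, times, gamma=0.0):
    """Sketch of CTM-style γ-sampling.

    ctm_model(x, t, s): assumed network mapping a sample at noise level t
                        to the PF-ODE solution at level s (hypothetical API).
    times: decreasing noise levels t_0 = T > t_1 > ... > t_N = 0.
    gamma: 0 gives fully deterministic long jumps; 1 reduces to
           CM-style multistep sampling (full denoise, then re-noise).
    """
    x = x_init
    for t_cur, t_next in zip(times[:-1], times[1:]):
        # Deterministic long jump down to an intermediate noise level.
        s = (1.0 - gamma**2) ** 0.5 * t_next
        x = ctm_model(x, t_cur, s)
        # Stochastic re-noising back up to t_next (skipped when gamma == 0).
        if gamma > 0:
            x = x + gamma * t_next * torch.randn_like(x)
    return x
```

Varying γ between 0 and 1 trades off the deterministic and stochastic regimes, which is the "new family of sampling schemes" noted in the Research Type cell.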
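The σ values in the Experiment Setup cell follow EDM's ρ-warped discretization (Karras et al., 2022); a short sketch of how sampling noise levels are typically derived from σ_min, σ_max, and ρ (function name illustrative):

```python
import torch

def edm_noise_schedule(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    # rho-warped interpolation between sigma_max and sigma_min
    # (EDM discretization, Karras et al., 2022).
    i = torch.arange(n_steps)
    inv_rho = 1.0 / rho
    sigmas = (sigma_max**inv_rho
              + i / (n_steps - 1) * (sigma_min**inv_rho - sigma_max**inv_rho)) ** rho
    return torch.cat([sigmas, torch.zeros(1)])  # append final sigma = 0
```

The resulting decreasing grid of noise levels can serve as the `times` schedule in the γ-sampling sketch above.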