Interpreting and Improving Diffusion Models from an Optimization Perspective
Authors: Frank Permenter, Chenyang Yuan
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6. Experiments |
| Researcher Affiliation | Industry | 1Toyota Research Institute, Cambridge, Massachusetts, USA. Correspondence to: Chenyang Yuan <chenyang.yuan@tri.global>, Frank Permenter <frank.permenter@tri.global>. |
| Pseudocode | Yes | Algorithm 1 DDIM sampler (Song et al., 2020a). Require: (σ_N, …, σ_0), x_N ∼ N(0, I), ε_θ. Ensure: compute x_0 with N evaluations of ε_θ. For t = N, …, 1 do x_{t−1} ← x_t + (σ_{t−1} − σ_t) ε_θ(x_t, σ_t). Return x_0. (A runnable sketch follows the table.) |
| Open Source Code | Yes | Code for the experiments is available at https://github.com/ToyotaResearchInstitute/gradient-estimation-sampler |
| Open Datasets | Yes | We use denoisers from (Ho et al., 2020; Song et al., 2020a) that were pretrained on the CIFAR-10 (32x32) and CelebA (64x64) datasets (Krizhevsky et al., 2009; Liu et al., 2015). |
| Dataset Splits | No | The paper mentions using 'training images' and evaluating on the 'MS COCO validation set', but does not provide specific details on train/validation/test splits for the datasets used to train or evaluate their models. |
| Hardware Specification | Yes | All the experiments were run on a single Nvidia RTX 4090 GPU. |
| Software Dependencies | Yes | We also use Stable Diffusion 2.1 provided in https://huggingface.co/stabilityai/stable-diffusion-2-1. (A loading sketch follows the table.) |
| Experiment Setup | Yes | For the CIFAR-10 and CelebA models, we choose σ_1 = √(σ_1^{DDIM(N)}) and σ_0 = 0.01. For CIFAR-10 with N = 5, 10, 20, 50 we choose σ_N = 40, and for CelebA with N = 5, 10, 20, 50 we choose σ_N = 40, 80, 100, 120 respectively. For Stable Diffusion, we use the same sigma schedule as in DDIM. ... We found that setting γ = 2 works well for N < 20; for larger N, slightly increasing γ also improves sample quality (see Appendix E for more details). (A schedule-construction sketch follows the table.) |
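The update rule quoted in the Pseudocode row is simple enough to state in code. Below is a minimal PyTorch sketch of Algorithm 1, assuming a pretrained noise predictor `eps_theta(x, sigma)` and a decreasing schedule `(σ_N, …, σ_0)`; the function name and tensor conventions are illustrative, not the authors' implementation.

```python
import torch

def ddim_sample(eps_theta, sigmas, shape, generator=None):
    """Minimal sketch of Algorithm 1 (DDIM sampler).

    sigmas: 1-D tensor (sigma_N, ..., sigma_0), strictly decreasing.
    eps_theta: callable (x, sigma) -> predicted noise, same shape as x.
    """
    # Initial iterate; the quoted algorithm draws x_N ~ N(0, I).
    # ASSUMPTION: some noise parameterizations instead scale this
    # draw by sigma_N; adjust to match the checkpoint's convention.
    x = torch.randn(shape, generator=generator)
    # One denoiser evaluation per step:
    # x_{t-1} = x_t + (sigma_{t-1} - sigma_t) * eps_theta(x_t, sigma_t)
    for sigma_t, sigma_prev in zip(sigmas[:-1], sigmas[1:]):
        x = x + (sigma_prev - sigma_t) * eps_theta(x, sigma_t)
    return x  # approximate x_0
```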
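The Software Dependencies row points at a Hugging Face checkpoint; one standard way to load it is via the `diffusers` library. The paper does not state its loading code, so the pipeline class and arguments below are ordinary `diffusers` usage rather than the authors' setup.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the checkpoint referenced in the Software Dependencies row.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,  # halves memory; fits a single RTX 4090
).to("cuda")

# Standard text-to-image call; num_inference_steps plays the role of N.
image = pipe(
    "a photograph of an astronaut riding a horse",
    num_inference_steps=20,
).images[0]
image.save("sample.png")
```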
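The Experiment Setup row pins down only the endpoints of the sigma schedule: σ_N, σ_1 = √(σ_1^{DDIM(N)}), and σ_0 = 0.01. The sketch below assembles a candidate schedule from those endpoints; the geometric spacing between σ_N and σ_1, and the placeholder value of `sigma_1_ddim`, are assumptions for illustration, since the paper's interior spacing is not quoted above.

```python
import numpy as np

def make_sigma_schedule(n_steps, sigma_max, sigma_1_ddim, sigma_min=0.01):
    """Build (sigma_N, ..., sigma_1, sigma_0) with the quoted endpoints.

    sigma_max:    sigma_N from the setup row (e.g. 40 for CIFAR-10).
    sigma_1_ddim: smallest nonzero sigma of an N-step DDIM schedule;
                  the setup row sets sigma_1 = sqrt(sigma_1_ddim).
    sigma_min:    sigma_0 = 0.01 as in the setup row.
    """
    sigma_1 = np.sqrt(sigma_1_ddim)
    # ASSUMPTION: log-linear (geometric) spacing from sigma_N down to
    # sigma_1; the paper's exact interior spacing is not quoted above.
    interior = np.geomspace(sigma_max, sigma_1, n_steps)
    return np.append(interior, sigma_min)

# Example: a 5-step CIFAR-10 schedule with sigma_N = 40 and a
# hypothetical sigma_1_ddim = 0.1.
sigmas = make_sigma_schedule(5, sigma_max=40.0, sigma_1_ddim=0.1)
```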