Accelerating Guided Diffusion Sampling with Splitting Numerical Methods
Authors: Suttisak Wizadwongsa, Supasorn Suwajanakorn
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 EXPERIMENTS: Extending on our observation that classical high-order methods failed on guided sampling, we conducted a series of experiments to investigate this problem and evaluate our solution. ... We measure image similarity using Learned Perceptual Image Patch Similarity (LPIPS) (Zhang et al., 2018) (lower is better) and measure sampling time on a single NVIDIA RTX 3090 and a 24-core AMD Threadripper 3960x. |
| Researcher Affiliation | Academia | Suttisak Wizadwongsa, Supasorn Suwajanakorn; VISTEC, Thailand; {suttisak.w_s19, supasorn.s}@vistec.ac.th |
| Pseudocode | Yes | Algorithm 1: Lie-Trotter Splitting (LTSP). Sample x_0 ∼ N(0, σ²_max I); for n ∈ {0, ..., N−1} do: y_{n+1} = PLMS(x_n, σ_n, σ_{n+1}, ϵ_σ); x_{n+1} = y_{n+1} − (σ_{n+1} − σ_n) f(y_{n+1}); end. Result: x_N. (A Python sketch of this loop follows the table.) |
| Open Source Code | No | No specific statement or link provided regarding the release of their own source code for the methodology. |
| Open Datasets | Yes | For our comparison, we use pre-trained state-of-the-art diffusion models and classifiers from Dhariwal & Nichol (2021), which were trained on the ImageNet dataset (Russakovsky et al., 2015) with 1,000 total sampling steps. |
| Dataset Splits | No | Not found. The paper mentions pre-trained models and evaluates on a test set but does not provide explicit train/validation/test splits for its own experimental setup. |
| Hardware Specification | Yes | We measure image similarity using Learned Perceptual Image Patch Similarity (LPIPS) (Zhang et al., 2018) (lower is better) and measure sampling time on a single NVIDIA RTX 3090 and a 24-core AMD Threadripper 3960x. |
| Software Dependencies | No | No specific software versions (e.g., Python 3.x, PyTorch 1.x) are mentioned. Only methods and models are cited. |
| Experiment Setup | Yes | We treat full-path samples from a classifier-guided DDIM at 1,000 steps as reference solutions. Then, the performance of each configuration is measured by the image similarity between its generated samples using fewer steps and the reference DDIM samples, both starting from the same initial noise map. ... Following (Dhariwal & Nichol, 2021), we use a 25-step DDIM as a baseline, which already produces visually reasonable results. As PLMS and LTSP require the same number of network evaluations as the DDIM, they are also used with 25 steps. For STSP with a slower evaluation time, it is only allowed 20 steps, which is the highest number of steps such that its sampling time is within that of the baseline 25-step DDIM. (A sketch of the LPIPS comparison follows the table.) |
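
The Pseudocode row quotes Algorithm 1 (LTSP), which alternates a PLMS step on the diffusion sub-problem with an Euler step on the guidance sub-problem. Below is a minimal Python/PyTorch sketch of that loop, not the authors' code: `eps_model` (the pretrained noise predictor ϵ), `guidance_grad` (the guidance term f), and `sigmas` (a decreasing noise schedule) are assumed placeholders, and the PLMS step is approximated here with standard Adams-Bashforth coefficients.

```python
import torch

# Sketch of Algorithm 1 (LTSP); `eps_model(x, sigma)` and `guidance_grad(x, sigma)`
# are hypothetical callables for the pretrained noise predictor and the guidance term f.
def ltsp_sample(eps_model, guidance_grad, sigmas, shape, device="cpu"):
    x = sigmas[0] * torch.randn(shape, device=device)  # x_0 ~ N(0, sigma_max^2 I)
    eps_buffer = []                                    # past noise predictions for PLMS

    for n in range(len(sigmas) - 1):
        sigma, sigma_next = sigmas[n], sigmas[n + 1]

        # Sub-problem 1: diffusion ODE advanced by a pseudo linear multistep (PLMS) step,
        # realized here with Adams-Bashforth coefficients over past eps predictions.
        eps_buffer.append(eps_model(x, sigma))
        eps_buffer = eps_buffer[-4:]                   # keep at most 4 evaluations
        coeffs = {
            1: [1.0],
            2: [-1 / 2, 3 / 2],
            3: [5 / 12, -16 / 12, 23 / 12],
            4: [-9 / 24, 37 / 24, -59 / 24, 55 / 24],
        }[len(eps_buffer)]                             # ordered oldest -> newest
        eps_plms = sum(c * e for c, e in zip(coeffs, eps_buffer))
        y = x + (sigma_next - sigma) * eps_plms        # y_{n+1}

        # Sub-problem 2: guidance ODE advanced by a single Euler step.
        x = y - (sigma_next - sigma) * guidance_grad(y, sigma_next)  # x_{n+1}

    return x                                           # x_N
```

A 25-step run corresponds to a `sigmas` schedule of 26 values decreasing from σ_max toward 0, matching the step counts discussed in the Experiment Setup row.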
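
The Experiment Setup row measures each fast sampler by the LPIPS distance between its outputs and reference samples from a 1,000-step classifier-guided DDIM, both started from the same initial noise. Below is a minimal sketch of that comparison, assuming the reference `lpips` package (Zhang et al., 2018) and precomputed image batches in [-1, 1]; the function and variable names are illustrative.

```python
import torch
import lpips  # pip install lpips; expects image tensors scaled to [-1, 1]

def mean_lpips(fast_samples, reference_samples, device="cpu"):
    """Average LPIPS (lower is better) between fast-sampler outputs and
    1,000-step DDIM reference samples generated from the same initial noise maps."""
    metric = lpips.LPIPS(net="alex").to(device)  # 'alex' backbone chosen for illustration
    with torch.no_grad():
        dists = metric(fast_samples.to(device), reference_samples.to(device))
    return dists.mean().item()
```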