Fast Samplers for Inverse Problems in Iterative Refinement Models

Authors: Kushagra Pandey, Ruihan Yang, Stephan Mandt

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed method's performance on various linear image restoration tasks across multiple datasets, employing diffusion and flow-matching models. Notably, on challenging inverse problems like 4× super-resolution on the ImageNet dataset, our method can generate high-quality samples in as few as 5 conditional sampling steps and outperforms competing baselines requiring 20-1000 steps.
Researcher Affiliation | Academia | Kushagra Pandey, Department of Computer Science, University of California, Irvine, pandeyk1@uci.edu; Ruihan Yang, Department of Computer Science, University of California, Irvine, ruihan.yang@uci.edu; Stephan Mandt, Department of Computer Science, University of California, Irvine, mandt@uci.edu
Pseudocode | Yes | Algorithm 1: Conjugate ΠGDM sampling; Algorithm 2: Conjugate ΠGFM sampling. (A generic ΠGDM-style guidance sketch is given after the table.)
Open Source Code | No | Our code will be publicly available at https://github.com/mandt-lab/c-pigdm.
Open Datasets | Yes | For diffusion models, we utilize an unconditional pre-trained ImageNet [Deng et al., 2009] checkpoint at 256×256 resolution from OpenAI [Dhariwal and Nichol, 2021]. For evaluations on the FFHQ dataset [Karras et al., 2019], we use a pre-trained checkpoint from Choi et al. [2021], also at 256×256 resolution. For flow model comparisons, we utilize three publicly available model checkpoints from Liu et al. [2022], trained on the AFHQ-Cat [Choi et al., 2020], LSUN-Bedroom [Yu et al., 2015], and CelebA-HQ [Karras et al., 2018] datasets.
Dataset Splits | Yes | For flows, we conduct evaluations on the entire validation set.
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, nor does it provide specific GPU/CPU models, processor types, or memory details.
Software Dependencies | No | For numerical approximation of these integrals, we use the odeint method from the torchdiffeq package [Chen, 2018] with parameters atol=1e-5, rtol=1e-5 and the RK45 solver [Dormand and Prince, 1980]. While specific packages are mentioned, their version numbers are not: torchdiffeq [Chen, 2018], for example, is cited by year but without a version. (A minimal call matching these solver settings is sketched after the table.)
Experiment Setup | Yes | We conduct an extensive search to optimize the parameters w, λ, and τ to identify the best-performing configuration based on sample quality. For diffusion baselines, we include DDRM [Kawar et al., 2022], DPS [Chung et al., 2022a], and ΠGDM [Song et al., 2022]. As recommended for DPS [Chung et al., 2022a], we use NFE=1000 for all tasks. For DDRM, we adhere to the original implementation and run it with η_b = 1.0 and η = 0.85 at NFE=20. We test our implementation of ΠGDM (see Section 2.2) with NFE values of 5, 10, and 20, and use the recommended guidance schedule of w_t = r_t^2 across all tasks. For flow models, we consider the recently proposed ΠGDM-inspired method running on the OT-ODE path by Pokle et al. [2024] (which we refer to as ΠGFM; see Appendix B), and similarly run it with NFE values of 5, 10, and 20. We optimize all baselines by conducting an extensive grid search over w and τ for the best performance in terms of sample quality. (An illustrative grid-search sketch follows the table.)
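The paper's Algorithms 1 and 2 are not reproduced in this report. For orientation only, below is a minimal sketch of the generic ΠGDM-style pseudo-inverse guidance they build on (noiseless form, following Song et al. [2022]), not the paper's conjugate variant; `denoiser`, `A`, `A_pinv`, and `sigma_t` are illustrative placeholders, not names from the paper.

```python
import torch

def pigdm_guidance(x_t, y, denoiser, A, A_pinv, sigma_t):
    """Generic ΠGDM-style pseudo-inverse guidance for a linear inverse
    problem y = A(x) + noise (illustrative; not the conjugate variant)."""
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoiser(x_t, sigma_t)  # model's estimate of E[x_0 | x_t]
    # Noiseless pseudo-inverse residual, mapped back to image space.
    residual = A_pinv(y) - A_pinv(A(x0_hat))
    # residual^T (dx0_hat / dx_t) via autograd: differentiate a scalar
    # surrogate whose gradient equals the desired vector-Jacobian product.
    vjp = torch.autograd.grad((residual.detach() * x0_hat).sum(), x_t)[0]
    return x0_hat.detach(), vjp
```

A sampler would plug `x0_hat` into its usual DDIM-style update and add the guidance term weighted by the schedule w_t = r_t^2 quoted above.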
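As a reference for the reported solver settings, the following is a self-contained sketch of the corresponding torchdiffeq call; `flow_model` is a placeholder for the pre-trained velocity field, and torchdiffeq exposes the adaptive Dormand-Prince (RK45) solver as method='dopri5'.

```python
import torch
from torchdiffeq import odeint

def flow_model(t, x):
    # Placeholder velocity field; a real run would evaluate the
    # pre-trained flow-matching network at time t.
    return -x

x0 = torch.randn(1, 3, 256, 256)   # initial sample (noise)
t = torch.linspace(0.0, 1.0, 2)    # integrate from t=0 to t=1
# Tolerances as reported in the paper; 'dopri5' is torchdiffeq's
# adaptive Dormand-Prince (RK45) solver.
x1 = odeint(flow_model, x0, t, atol=1e-5, rtol=1e-5, method='dopri5')[-1]
```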
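The search procedure is not spelled out beyond "extensive grid search"; a minimal illustration of an exhaustive search over (w, λ, τ) might look as follows, with `sample_quality` standing in for a real metric such as FID and the grid values chosen arbitrarily.

```python
import itertools

def sample_quality(w, lam, tau):
    # Placeholder objective standing in for the score (e.g., FID) of
    # samples drawn with guidance weight w, regularizer lam, and start
    # time tau; a real run would call the conditional sampler here.
    return (w - 1.0) ** 2 + (lam - 0.5) ** 2 + (tau - 0.9) ** 2

# Hypothetical grids; the paper's actual search ranges are not reported.
grid = itertools.product([0.5, 1.0, 2.0], [0.1, 0.5, 1.0], [0.8, 0.9, 1.0])
best = min(grid, key=lambda cfg: sample_quality(*cfg))
print("best (w, lambda, tau):", best)
```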