DiracDiffusion: Denoising and Incremental Reconstruction with Assured Data-Consistency

Authors: Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the efficiency of our method on different high-resolution datasets and inverse problems, achieving great improvements over other state-of-the-art diffusion-based methods with respect to both perceptual and distortion metrics. ... 4. Experiments, Experimental setup: We evaluate our method on CelebA-HQ (256×256) (Karras et al., 2018) and ImageNet (256×256) (Deng et al., 2009). ... Table 1. Experimental results on the FFHQ (top) and ImageNet (bottom) test splits.
Researcher Affiliation | Academia | Zalan Fabian¹, Berk Tinaz¹, Mahdi Soltanolkotabi¹. ¹Dept. of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA.
Pseudocode | Yes | Algorithm 1 Dirac
  Input: y: noisy observation, Φθ: score network, A_t(·): degradation function, ∆t: step size, σ_t: noise std at time t, η_t: guidance step size, ∆t ∈ [0, 1], t_stop: early-stopping parameter
  N ← 1/∆t
  ŷ ← y
  for i = 1 to N do
    t ← 1 − ∆t · i
    if t ≤ t_stop then break {Early-stopping} end if
    z ∼ N(0, σ_t² I)
    x̂₀ ← Φθ(ŷ, t) {Predict posterior mean}
    y_r ← A_{t−∆t}(x̂₀) − A_t(x̂₀) {Incremental reconstruction}
    y_d ← ((σ_{t−∆t}² − σ_t²)/σ_t²)(ŷ − A_t(x̂₀)) {Denoising}
    y_g ← (σ_{t−∆t}² − σ_t²) ∇_ŷ ‖y − A_1(x̂₀)‖² {Guidance}
    ŷ ← ŷ + y_r + y_d + η_t y_g + √(σ_t² − σ_{t−∆t}²) z
  end for
  Output: ŷ {Alternatively, output x̂₀ (see Appendix D)}
  (A minimal code sketch of this sampling loop is given after the table.)
Open Source Code | Yes | Code is available at https://github.com/z-fabian/dirac-diffusion
Open Datasets | Yes | We evaluate our method on CelebA-HQ (256×256) (Karras et al., 2018) and ImageNet (256×256) (Deng et al., 2009).
Dataset Splits | Yes | For CelebA-HQ training, we use 80% of the dataset for training, and the rest for validation and testing. For ImageNet experiments, we sample 1 image from each class from the official validation split to create disjoint validation and test sets of 1k images each. (A split sketch is given after the table.)
Hardware Specification | Yes | We train all models with Adam optimizer, with learning rate 0.0001 and batch size 32 on 8 Titan RTX GPUs, with the exception of the large model used for ImageNet inpainting experiments, which we trained on 8 A6000 GPUs.
Software Dependencies | No | The paper mentions software components like the Adam optimizer and uses specific model architectures (NCSN++, SDE-VP models), but does not provide specific version numbers for any libraries, programming languages (e.g., Python), or frameworks (e.g., PyTorch, TensorFlow, CUDA) used.
Experiment Setup | Yes | Training details: We train all models with Adam optimizer, with learning rate 0.0001 and batch size 32 on 8 Titan RTX GPUs... Sampling hyperparameters: The settings are summarized in Table 3. We tune the reverse process hyperparameters on validation data. For the interpretation of guidance scaling we refer the reader to the explanation of guidance step size methods in Section B. In Table 3, output refers to whether the final reconstruction is the last model output (posterior mean estimate, x̂₀ = Φθ(y_t, t)) or the final iterate y_t. (A minimal training-configuration sketch is given after the table.)
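
For readers who prefer code, the following is a minimal PyTorch sketch of the sampling loop in Algorithm 1 above. The names dirac_sample, score_net, A, and sigma are placeholders chosen for illustration and are not the interfaces of the authors' repository; the official implementation is at the GitHub link in the table.

import math
import torch

def dirac_sample(y, score_net, A, sigma, dt=0.02, eta=1.0, t_stop=0.0):
    """Sketch of Algorithm 1 (Dirac): incrementally reconstruct observation y.

    y         -- noisy observation, tensor of shape [B, C, H, W]
    score_net -- score_net(y_hat, t) returns the posterior mean estimate x0_hat
    A         -- A(x, t) applies the time-dependent degradation operator
    sigma     -- sigma(t) returns the noise std at time t (float)
    """
    y_hat = y.clone()                      # iterate, initialized at the observation
    N = int(round(1.0 / dt))
    for i in range(1, N + 1):
        t = 1.0 - dt * i
        if t <= t_stop:                    # early stopping
            break
        # Posterior mean prediction, kept differentiable for the guidance gradient.
        y_hat = y_hat.detach().requires_grad_(True)
        x0_hat = score_net(y_hat, t)
        dc_loss = ((y - A(x0_hat, 1.0)) ** 2).sum()      # data-consistency loss
        grad = torch.autograd.grad(dc_loss, y_hat)[0]    # gradient w.r.t. the iterate
        with torch.no_grad():
            s2_t = sigma(t) ** 2
            s2_prev = sigma(t - dt) ** 2
            y_r = A(x0_hat, t - dt) - A(x0_hat, t)                  # incremental reconstruction
            y_d = (s2_prev - s2_t) / s2_t * (y_hat - A(x0_hat, t))  # denoising
            y_g = (s2_prev - s2_t) * grad                           # guidance
            noise = math.sqrt(max(s2_t - s2_prev, 0.0)) * torch.randn_like(y_hat)
            y_hat = y_hat + y_r + y_d + eta * y_g + noise
    return y_hat.detach()                  # alternatively, return the last x0_hat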
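The split protocol quoted in the Dataset Splits row can be mirrored in a few lines of Python. This is only an illustrative sketch: the even validation/test division of the CelebA-HQ hold-out, the seed, and the val_images_by_class structure are assumptions, not the authors' exact procedure.

import random

def celeba_hq_splits(num_images=30000, train_frac=0.8, seed=0):
    """80% of CelebA-HQ for training; the remainder split into validation and test."""
    idx = list(range(num_images))
    random.Random(seed).shuffle(idx)
    n_train = int(train_frac * num_images)
    held_out = idx[n_train:]
    mid = len(held_out) // 2
    return idx[:n_train], held_out[:mid], held_out[mid:]

def imagenet_val_test(val_images_by_class, seed=0):
    """Build disjoint 1k-image validation and test sets from the official ImageNet
    validation split, one image per class in each set."""
    rng = random.Random(seed)
    val_set, test_set = [], []
    for cls, paths in val_images_by_class.items():
        a, b = rng.sample(paths, 2)        # two distinct images from this class
        val_set.append(a)
        test_set.append(b)
    return val_set, test_set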
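The optimizer settings quoted in the Hardware Specification and Experiment Setup rows translate directly into PyTorch. The toy network and random tensors below are placeholders for the NCSN++-style score model and the CelebA-HQ/ImageNet data pipelines, which are not specified in the table.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder score network and data; the paper trains NCSN++/SDE-VP models on 256x256 images.
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.SiLU(), nn.Conv2d(64, 3, 3, padding=1))
dataset = TensorDataset(torch.randn(64, 3, 256, 256))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # learning rate 0.0001, as stated
loader = DataLoader(dataset, batch_size=32, shuffle=True)  # batch size 32, as stated
# Multi-GPU training (8x Titan RTX or 8x A6000 in the paper) would typically wrap the model
# in torch.nn.parallel.DistributedDataParallel; omitted here for brevity.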