A Variational Perspective on Solving Inverse Problems with Diffusion Models
Authors: Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments for various linear and nonlinear image restoration tasks demonstrate the strengths of our method compared with state-of-the-art sampling-based diffusion models. |
| Researcher Affiliation | Industry | Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat — NVIDIA; {mmardani,jiamings,jkautz,avahdat}@nvidia.com |
| Pseudocode | Yes | Algorithm 1 Variational sampler (RED-diff); a hedged sketch follows this table. |
| Open Source Code | Yes | The code is available online: https://github.com/NVlabs/RED-diff |
| Open Datasets | Yes | For the prior, we adopt publicly available checkpoints from the guided diffusion model that is pretrained based on 256×256 ImageNet (Russakovsky et al., 2015); see details in the appendix. [...] We use the multi-coil fastMRI brain dataset (Zbontar et al., 2018) with 1D equispaced undersampling, and the fully-sampled 3D fast-spin echo multi-coil knee MRI dataset from Ong et al. (2018) with a 2D Poisson Disc undersampling mask, as in Jalal et al. (2021). |
| Dataset Splits | Yes | For the proof of concept, we report findings for various linear and nonlinear image restoration tasks for a 1k subset of the ImageNet (Russakovsky et al., 2015) validation dataset. |
| Hardware Specification | Yes | Across all methods, we also use a batch size of 10 using RTX 6000 Ada GPU with 48GB RAM. [...] All methods run on a single NVIDIA RTX 6000 Ada GPU with 48GB RAM. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and diffusion models (implying frameworks such as PyTorch or TensorFlow, typically with CUDA), but it does not specify version numbers for any software dependency. |
| Experiment Setup | Yes | For our variational sampler we adopt the Adam optimizer with 1,000 steps, and set the momentum pair (0.9, 0.99) and initial learning rate 0.1. No weight decay regularization is used. The optimizer is initialized with the degraded image input. We also choose descending time stepping from t = T to t = 1 as demonstrated by the ablations later in Section 5.3.2. Across all methods, we also use a batch size of 10 using an RTX 6000 Ada GPU with 48GB RAM. |
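To make the "Pseudocode" and "Experiment Setup" rows concrete, below is a minimal PyTorch-style sketch of the RED-diff variational sampler (Algorithm 1) using the hyperparameters quoted above: Adam with 1,000 steps, momentum pair (0.9, 0.99), learning rate 0.1, no weight decay, descending time stepping from t = T to t = 1, and initialization from the degraded input. The names `eps_model`, `A`, `alphas`, `sigmas`, and the constant weight `lam` are illustrative assumptions rather than the authors' exact code; see the official repository (https://github.com/NVlabs/RED-diff) for the real implementation.

```python
import torch

def red_diff(x_init, y, A, eps_model, alphas, sigmas,
             lr=0.1, sigma_y=1.0, lam=lambda t: 0.25):
    """Sketch of the RED-diff variational sampler: one Adam step per
    descending diffusion time, so T = 1000 gives the paper's 1,000 steps.

    x_init    -- degraded-image initialization for the estimate mu (assumed)
    y         -- observed measurements
    A         -- (possibly nonlinear) forward operator, y ~ A(mu) + noise
    eps_model -- pretrained noise predictor eps_theta(x_t, t) (signature assumed)
    alphas, sigmas -- diffusion schedule arrays indexed by t = 0..T (assumed)
    lam       -- regularization weight schedule lambda_t (assumed constant here)
    """
    T = len(alphas) - 1
    mu = x_init.clone().requires_grad_(True)                 # init with degraded input
    opt = torch.optim.Adam([mu], lr=lr, betas=(0.9, 0.99))   # paper's momentum pair; default weight_decay=0
    for t in range(T, 0, -1):                                # descending: t = T, ..., 1
        eps = torch.randn_like(mu)
        x_t = alphas[t] * mu + sigmas[t] * eps               # noised copy of the current estimate
        with torch.no_grad():                                # no gradient through the denoiser
            residual = eps_model(x_t, t) - eps
        # Data-fidelity term plus a score-matching regularizer whose gradient
        # w.r.t. mu is lam(t) * residual, since residual is detached.
        loss = ((y - A(mu)) ** 2).sum() / (2 * sigma_y ** 2) \
               + lam(t) * (residual * mu).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mu.detach()
```

Detaching the denoiser residual means the regularizer contributes the gradient λ_t(ε_θ(x_t, t) − ε) with respect to μ, which is what turns sampling into a sequence of cheap first-order optimization steps in RED-diff.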