Solving Inverse Problems with Latent Diffusion Models via Hard Data Consistency

Authors: Bowen Song, Soo Min Kwon, Zecheng Zhang, Xinyu Hu, Qing Qu, Liyue Shen

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "With extensive experiments on multiple tasks and various datasets, encompassing both natural and medical images, our proposed algorithm achieves state-of-the-art performance on a variety of linear and nonlinear inverse problems." "We conduct experiments to solve both linear and nonlinear inverse problems on natural and medical images." (Section 4, Experiments)
Researcher Affiliation | Collaboration | ¹University of Michigan, ²Kumo.AI, ³Microsoft
Pseudocode | Yes | "Algorithm 1 ReSample: Solving Inverse Problems with Latent Diffusion Models" (a hedged code sketch of this algorithm follows the table)
Open Source Code | Yes | "Lastly, our code is available at https://github.com/soominkwon/resample."
Open Datasets | Yes | "For the experiments on natural images, we use datasets FFHQ (Kazemi & Sullivan, 2014), CelebA-HQ (Liu et al., 2015), and LSUN-Bedroom (Yu et al., 2016) with the image resolution of 256×256×3. For experiments on medical images, we fine-tune LDMs on 2000 2D computed tomography (CT) images with image resolution of 256×256, randomly sampled from the AAPM LDCT dataset (Moen et al., 2021) of 40 patients."
Dataset Splits | Yes | "Then, we sample 100 images from both the FFHQ and CelebA-HQ validation sets for testing evaluation. For experiments on medical images, we fine-tune LDMs on 2000 2D computed tomography (CT) images... and test on the 300 2D CT images from the remaining 10 patients. For FBP-UNet and PnP-UNet, we trained a model on 3480 2D CT images from the training set."
Hardware Specification | Yes | "All experiments are implemented in PyTorch on NVIDIA GPUs (A100 and A40)." "...computed on a V100 GPU."
Software Dependencies | No | The paper mentions "PyTorch" but does not provide specific version numbers for PyTorch or any other software libraries, environments, or solvers used for reproducibility.
Experiment Setup | Yes | "For T, we used T = 500 DDIM steps. For hard data consistency, we first split T into three even sub-intervals. ... We set τ = 10⁻⁴, which seemed to give us the best results for noisy inverse problems, with a maximum number of iterations of 2000 for pixel optimization and 500 for latent optimization (whichever convergence criterion came first). For the variance hyperparameter σ_t in the stochastic resample step, we chose an adaptive schedule of σ_t² = γ(1 − ᾱ_{t−1}), as discussed in Section A. Generally, we see that γ = 40 returns the best results for experiments on natural images." (the step schedule is sketched in code below)
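
For readers who want a concrete picture of Algorithm 1, below is a minimal PyTorch sketch of the ReSample loop as described in the quoted excerpts: unconditional DDIM sampling in latent space, with hard data consistency (solving min_z ||y − A(D(z))||²) and a stochastic resample step applied at selected time steps. All names here (`eps_model`, `decoder`, `A`, `alpha_bar`, `resample_steps`) and the optimizer learning rate are assumptions, and the mixing formula in the resample step is one plausible reading of the paper's stochastic resampling; the released code at https://github.com/soominkwon/resample is the authoritative implementation.

```python
# Minimal sketch of ReSample (Algorithm 1), under the assumptions above.
import torch

def resample_sketch(eps_model, decoder, A, y, alpha_bar, resample_steps,
                    gamma=40.0, tau=1e-4, max_iters=500):
    z = torch.randn(1, 4, 64, 64)          # latent shape is an assumption
    T = len(alpha_bar) - 1
    for t in range(T, 0, -1):
        eps = eps_model(z, t)              # hypothetical noise-predictor signature
        # predicted clean latent from the standard DDIM/DDPM identity
        z0_hat = (z - (1 - alpha_bar[t]).sqrt() * eps) / alpha_bar[t].sqrt()
        # unconditional deterministic DDIM update (eta = 0)
        z_prev = alpha_bar[t-1].sqrt() * z0_hat + (1 - alpha_bar[t-1]).sqrt() * eps
        if t in resample_steps:
            # hard data consistency: z0(y) = argmin_z ||y - A(D(z))||^2,
            # initialized at the current prediction z0_hat
            z0 = z0_hat.detach().clone().requires_grad_(True)
            opt = torch.optim.Adam([z0], lr=1e-2)   # lr is an assumption
            for _ in range(max_iters):
                loss = (y - A(decoder(z0))).pow(2).sum()
                opt.zero_grad()
                loss.backward()
                opt.step()
                if loss.item() < tau:               # stopping threshold tau
                    break
            # stochastic resample: map z0(y) back to time t-1, mixing it with
            # the unconditional z_prev using sigma_t^2 = gamma*(1 - alpha_bar[t-1])
            var = gamma * (1 - alpha_bar[t-1])
            denom = var + (1 - alpha_bar[t-1])
            mean = (var * alpha_bar[t-1].sqrt() * z0.detach()
                    + (1 - alpha_bar[t-1]) * z_prev) / denom
            std = (var * (1 - alpha_bar[t-1]) / denom).sqrt()
            z_prev = mean + std * torch.randn_like(z_prev)
        z = z_prev
    return decoder(z)
```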
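
The quoted experiment setup can also be written down as a small, hypothetical schedule helper: T = 500 DDIM steps split into three even sub-intervals, hard data consistency skipped in the noisiest interval, and the adaptive variance σ_t² = γ(1 − ᾱ_{t−1}) with γ = 40. The interval assignment below is an assumption based on the paper's description; which sub-intervals use pixel-space versus latent-space optimization is simplified away here.

```python
# Hypothetical schedule helper matching the quoted hyperparameters.
def make_schedule(T=500, gamma=40.0):
    third = T // 3
    # t runs from T down to 1; skip hard data consistency for t > 2*third
    # (the noisiest sub-interval) and apply it on the remaining two thirds
    resample_steps = set(range(1, 2 * third + 1))
    def sigma2(alpha_bar, t):
        # adaptive variance sigma_t^2 = gamma * (1 - alpha_bar_{t-1})
        return gamma * (1 - alpha_bar[t - 1])
    return resample_steps, sigma2

resample_steps, sigma2 = make_schedule()  # usage with the sketch above
```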