A Diffusion Model with State Estimation for Degradation-Blind Inverse Imaging

Authors: Liya Ji, Zhefan Rao, Sinno Jialin Pan, Chenyang Lei, Qifeng Chen

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "The experiments on three typical inverse imaging problems (both linear and non-linear), inpainting, deblurring, and JPEG compression restoration, have comparable results with the state-of-the-art methods." |
| Researcher Affiliation | Academia | (1) Hong Kong University of Science and Technology; (2) The Chinese University of Hong Kong; (3) CAIR, HKISI-CAS |
| Pseudocode | Yes | Algorithm 1: Training |
| Open Source Code | No | The paper does not explicitly state that source code for its methodology is released, nor does it provide a link to any. |
| Open Datasets | Yes | "We evaluate our models on two standard datasets, FFHQ (Karras, Laine, and Aila 2019) and LSUN-Bedroom (Yu et al. 2015), which are comparable to current state-of-the-art models for inverse problems." |
| Dataset Splits | Yes | "We evaluate our performance on a held-out dataset with 1000 images, which were randomly sampled from the validation datasets of FFHQ and LSUN-Bedroom, respectively. The training pairs for each task in the FFHQ dataset and LSUN-Bedroom are 1000 and 2000, respectively." |
| Hardware Specification | Yes | "We train and evaluate our model on NVIDIA A100 GPU cards." |
| Software Dependencies | No | The paper mentions latent diffusion models and U-Net, implying a deep learning framework such as PyTorch, but does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | "The training pairs for each task in the FFHQ dataset and LSUN-Bedroom are 1000 and 2000, respectively. We train the State Estimator from scratch and fine-tune the diffusion model. The total step is around 75k for FFHQ and 150k for LSUN-Bedroom with batch size 4. The learning rate is 0.001 for the State Estimator and 10⁻⁶ for the fine-tuning of the diffusion model. The hyperparameter η for the diffusion model is set to 3.0 both for training and inference. The truncated step η equals 3." |
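For reference, the hyperparameters reported under Experiment Setup can be gathered into a small configuration sketch. This is purely illustrative: the authors did not release code, so all field and variable names below are assumptions, not the paper's actual implementation.

```python
# Hedged sketch of the training configuration reported in the paper.
# Field names are invented for illustration; the official code is not public.
from dataclasses import dataclass

@dataclass
class TrainConfig:
    dataset: str
    train_pairs: int                     # per-task training pairs
    total_steps: int                     # approximate total training steps
    batch_size: int = 4
    lr_state_estimator: float = 1e-3     # State Estimator, trained from scratch
    lr_diffusion_finetune: float = 1e-6  # fine-tuning of the diffusion model
    truncated_step_eta: int = 3          # truncated step η = 3

ffhq = TrainConfig(dataset="FFHQ", train_pairs=1000, total_steps=75_000)
lsun = TrainConfig(dataset="LSUN-Bedroom", train_pairs=2000, total_steps=150_000)
```

Packaging the two dataset setups as instances of one dataclass makes the FFHQ/LSUN-Bedroom differences (training pairs and total steps) explicit while keeping the shared hyperparameters in one place.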