Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs

Authors: Qinpeng Cui, Yixuan Liu, Xinyi Zhang, Qiqi Bao, Qingmin Liao, Li Wang, Lu Tian, Zicheng Liu, Zhongdao Wang, Emad Barsoum

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Empirical results demonstrate that our proposed method achieves state-of-the-art performance on synthetic and real-world datasets, while notably requiring only 5 sampling steps.
Researcher Affiliation Collaboration 1Advanced Micro Devices Inc. 2Tsinghua University
Pseudocode Yes Appendix B Pseudocode
Open Source Code Yes Code: https://github.com/AMD-AIG-AIMA/DoSSR
Open Datasets Yes For training, we train our DoSSR using a variety of datasets including DIV2K [1], DIV8K [15], Flickr2K [43], and OST [47].
Dataset Splits Yes For synthetic data, we randomly crop 3K patches with a resolution of 512×512 from the DIV2K validation set [1].
Hardware Specification Yes The latency is calculated on the ×4 SR task for 128×128 LR images on a V100 GPU.
Software Dependencies No The paper mentions 'Stable Diffusion 2.1-base3' and 'Adam optimizer [20]' but does not provide specific software versions for key dependencies like Python, PyTorch, or CUDA.
Experiment Setup Yes The model is fine-tuned for 50k iterations using the Adam optimizer [20], with a batch size of 32 and a learning rate of 5×10⁻⁵, on 512×512 resolution images.
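The dataset-splits row describes randomly cropping 512×512 evaluation patches from the DIV2K validation set. A minimal sketch of such a crop, assuming images are numpy H×W×C arrays; `random_crop` is a hypothetical helper for illustration, not the authors' preprocessing code:

```python
import numpy as np

def random_crop(img: np.ndarray, size: int = 512, rng=None) -> np.ndarray:
    """Crop a random size x size patch from an H x W x C image (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return img[top:top + size, left:left + size]

# Example: crop 3 patches from a dummy 1024x2048 RGB image
img = np.zeros((1024, 2048, 3), dtype=np.uint8)
patches = [random_crop(img) for _ in range(3)]
```

The paper's split uses 3K such patches; the dummy image above only stands in for a DIV2K validation frame.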
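The experiment-setup row cites Adam [20] with a learning rate of 5×10⁻⁵. The Adam update rule itself is standard (Kingma & Ba); a minimal numpy sketch of a single step with the paper's learning rate, not the authors' implementation:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=5e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; lr matches the paper's 5e-5 setting, other
    hyperparameters are Adam's common defaults (an assumption here)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

In practice one would use a framework optimizer (e.g. `torch.optim.Adam`) over the fine-tuned parameters; the sketch only makes the cited update explicit.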