ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting
Authors: Zongsheng Yue, Jianyi Wang, Chen Change Loy
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that the proposed method obtains superior or at least comparable performance to current state-of-the-art methods on both synthetic and real-world datasets, even only with 15 sampling steps. |
| Researcher Affiliation | Academia | Zongsheng Yue, Jianyi Wang, Chen Change Loy; S-Lab, Nanyang Technological University; {zongsheng.yue,jianyi001,ccloy}@ntu.edu.sg |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and model are available at https://github.com/zsyOAOA/ResShift. |
| Open Datasets | Yes | HR images with a resolution of 256×256 in our training data are randomly cropped from the training set of ImageNet [50] following LDM [11]. |
| Dataset Splits | No | HR images with a resolution of 256×256 in our training data are randomly cropped from the training set of ImageNet [50] following LDM [11]. We synthesize a testing dataset that contains 3000 images randomly selected from the validation set of ImageNet [50] based on the commonly-used degradation model, i.e., y = (x ⊗ k)↓_s + n, where k is the blurring kernel, ↓_s is the downsampling operator with scale factor s, n is the noise, and y and x denote the LR image and HR image, respectively. ... We name this dataset as ImageNet-Test for convenience. |
| Hardware Specification | Yes | Running time is tested on NVIDIA Tesla V100 GPU on the ×4 (64→256) SR task. |
| Software Dependencies | No | The Adam [51] algorithm with the default settings of PyTorch [52] and a mini-batch size of 64 is used to train ResShift. (No specific PyTorch version is mentioned.) |
| Experiment Setup | Yes | HR images with a resolution of 256×256 in our training data are randomly cropped from the training set of ImageNet [50] following LDM [11]. We synthesize the LR images using the degradation pipeline of Real-ESRGAN [19]. The Adam [51] algorithm with the default settings of PyTorch [52] and a mini-batch size of 64 is used to train ResShift. During training, we use a fixed learning rate of 5e-5 and update the weight parameters for 500K iterations. As for the network architecture, we employ the UNet structure in DDPM [2]. To increase the robustness of ResShift to arbitrary image resolution, we replace the self-attention layer in UNet with the Swin Transformer [53] block. |
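The ImageNet-Test set quoted above is built with the classical degradation model y = (x ⊗ k)↓_s + n. A minimal NumPy sketch of that model is shown below for a single-channel image; it is not the Real-ESRGAN pipeline the paper uses for training data, and the Gaussian kernel size/sigma and noise level here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.6):
    # Isotropic Gaussian blurring kernel k (size/sigma chosen for illustration).
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()  # normalize so the kernel preserves mean intensity

def degrade(x, k, scale=4, noise_sigma=0.01, seed=None):
    """y = (x ⊗ k)↓_s + n for a single-channel HR image x in [0, 1]."""
    rng = np.random.default_rng(seed)
    pad = k.shape[0] // 2
    xp = np.pad(x, pad, mode="reflect")       # reflect-pad before convolution
    blurred = np.zeros_like(x)
    h, w = x.shape
    for i in range(h):                        # x ⊗ k (naive 2D convolution)
        for j in range(w):
            blurred[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    down = blurred[::scale, ::scale]          # ↓_s: strided downsampling
    n = rng.normal(0.0, noise_sigma, down.shape)  # n: additive Gaussian noise
    return np.clip(down + n, 0.0, 1.0)
```

With `scale=4`, a 256×256 HR image maps to a 64×64 LR image, matching the ×4 (64→256) SR task reported in the running-time measurement.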