Random Shuffle Transformer for Image Restoration
Authors: Jie Xiao, Xueyang Fu, Man Zhou, Hongjian Liu, Zheng-Jun Zha
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of ShuffleFormer across a variety of image restoration tasks, including image denoising, deraining, and deblurring. |
| Researcher Affiliation | Academia | 1University of Science and Technology of China, Hefei, China. Correspondence to: Xueyang Fu <xyfu@ustc.edu.cn>. |
| Pseudocode | Yes | We provide python implementation for the random shuffle and corresponding inverse shuffle in Algorithm 1 and 2, which only involves rearrangement operations. (A minimal sketch of these operations is given after the table.) |
| Open Source Code | Yes | Code is available at https://github.com/jiexiaou/ShuffleFormer. |
| Open Datasets | Yes | We perform the real noise removal experiment on SIDD (Abdelhamed et al., 2018) datasets. Image deraining experiments are performed on the real-world SPA-Data (Purohit et al., 2021)... We conduct deblurring experiments on four benchmark datasets, including two synthesized datasets (GoPro (Nah et al., 2017) and HIDE (Shen et al., 2019)), and two real-world datasets (RealBlur-R (Rim et al., 2020) and RealBlur-J (Rim et al., 2020)). |
| Dataset Splits | Yes | Following Uformer (Wang et al., 2022), ShuffleFormers employ a four-level encoder-decoder structure. The numbers of Shuffle Block are {1, 2, 8, 8} for level-1 to level-4 of Encoder and the blocks for Decoder are mirrored. The number of channel is set to 32 and the window size is 8×8. We train the network with Adam optimizer (β1 = 0.9, β2 = 0.999) with the initial learning rate 2×10⁻⁴ gradually reduced to 1×10⁻⁶ with the cosine annealing. The training samples are augmented by the horizontal flipping and rotation of 90°, 180°, or 270°. |
| Hardware Specification | Yes | We train ShuffleFormers using four TITAN Xp GPUs with batch size 8 on 256×256 image pairs. The training process lasts for 10 epochs. |
| Software Dependencies | No | The paper mentions using a Python implementation for the shuffle operations but does not specify software library versions or other dependencies. |
| Experiment Setup | Yes | The number of channel is set to 32 and the window size is 8×8. We train the network with Adam optimizer (β1 = 0.9, β2 = 0.999) with the initial learning rate 2×10⁻⁴ gradually reduced to 1×10⁻⁶ with the cosine annealing. The training samples are augmented by the horizontal flipping and rotation of 90°, 180°, or 270°. The number of samples used in Monte-Carlo averaging is 16, i.e., M = 16. (Sketches of the quoted training settings and the Monte-Carlo averaging follow the table.) |
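For reference, the following is a minimal sketch of the random shuffle and inverse shuffle described in the Pseudocode row. The function names and the (B, C, H, W) PyTorch tensor layout are assumptions made here for illustration; the authors' exact Algorithms 1 and 2 are given in the paper and the linked repository.

```python
import torch

def random_shuffle(x: torch.Tensor):
    """Randomly permute the spatial positions of a feature map.

    x: (B, C, H, W) tensor. Returns the shuffled tensor together with the
    permutation needed to undo the shuffle. (Hypothetical sketch, not the
    authors' Algorithm 1.)
    """
    b, c, h, w = x.shape
    perm = torch.randperm(h * w, device=x.device)   # one global spatial permutation
    x_flat = x.flatten(2)                           # (B, C, H*W)
    x_shuffled = x_flat[:, :, perm].view(b, c, h, w)
    return x_shuffled, perm

def inverse_shuffle(x_shuffled: torch.Tensor, perm: torch.Tensor):
    """Restore the original spatial arrangement from the saved permutation."""
    b, c, h, w = x_shuffled.shape
    inv = torch.argsort(perm)                       # argsort of a permutation is its inverse
    return x_shuffled.flatten(2)[:, :, inv].view(b, c, h, w)

# Round-trip check: inverse_shuffle exactly undoes random_shuffle.
x = torch.rand(2, 32, 8, 8)
y, perm = random_shuffle(x)
assert torch.equal(inverse_shuffle(y, perm), x)
```

Because `torch.argsort` of a permutation yields its inverse, the two functions are exact inverses of each other, consistent with the claim that only rearrangement operations are involved.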
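The quoted training settings (Adam with β1 = 0.9, β2 = 0.999, learning rate annealed from 2×10⁻⁴ to 1×10⁻⁶ with cosine annealing, batch size 8 on 256×256 patches, flip/rotation augmentation) could be expressed in PyTorch roughly as below. The model, loss function, and iteration count are placeholders, not the authors' training script.

```python
import random
import torch

# Placeholders standing in for ShuffleFormer and its schedule length.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
total_steps = 1000  # placeholder; the quote does not fix the schedule length

optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=total_steps, eta_min=1e-6)     # 2e-4 annealed to 1e-6

def augment(img: torch.Tensor) -> torch.Tensor:
    # Horizontal flip and rotation by 90, 180, or 270 degrees, as quoted.
    if random.random() < 0.5:
        img = torch.flip(img, dims=[-1])
    return torch.rot90(img, random.randint(0, 3), dims=[-2, -1])

for step in range(total_steps):
    batch = augment(torch.rand(8, 3, 256, 256))     # batch size 8, 256x256 patches
    loss = torch.nn.functional.l1_loss(model(batch), batch)  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```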
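The Experiment Setup row also notes that inference averages over M = 16 Monte-Carlo samples of the random shuffle. A hypothetical wrapper illustrating that averaging, assuming the model draws a fresh random permutation on every forward pass:

```python
import torch

def monte_carlo_forward(model: torch.nn.Module, x: torch.Tensor, num_samples: int = 16):
    """Average predictions over several forward passes (M = 16 in the quoted
    setup). Sketch only; `model` is assumed to redraw its random shuffle
    internally on each call."""
    model.eval()
    with torch.no_grad():
        outputs = [model(x) for _ in range(num_samples)]
    return torch.stack(outputs, dim=0).mean(dim=0)
```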