Random Shuffle Transformer for Image Restoration

Authors: Jie Xiao, Xueyang Fu, Man Zhou, Hongjian Liu, Zheng-Jun Zha

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of ShuffleFormer across a variety of image restoration tasks, including image denoising, deraining, and deblurring.
Researcher Affiliation | Academia | University of Science and Technology of China, Hefei, China. Correspondence to: Xueyang Fu <xyfu@ustc.edu.cn>.
Pseudocode | Yes | We provide a Python implementation for the random shuffle and the corresponding inverse shuffle in Algorithms 1 and 2, which only involve rearrangement operations. (A minimal sketch follows the table.)
Open Source Code | Yes | Code is available at https://github.com/jiexiaou/ShuffleFormer.
Open Datasets | Yes | We perform the real noise removal experiment on the SIDD (Abdelhamed et al., 2018) dataset. Image deraining experiments are performed on the real-world SPA-Data (Purohit et al., 2021)... We conduct deblurring experiments on four benchmark datasets, including two synthesized datasets (GoPro (Nah et al., 2017) and HIDE (Shen et al., 2019)) and two real-world datasets (RealBlur-R (Rim et al., 2020) and RealBlur-J (Rim et al., 2020)).
Dataset Splits | Yes | Following Uformer (Wang et al., 2022), ShuffleFormers employ a four-level encoder-decoder structure. The numbers of Shuffle Blocks are {1, 2, 8, 8} for level-1 to level-4 of the Encoder, and the blocks for the Decoder are mirrored. The number of channels is set to 32 and the window size is 8×8. We train the network with the Adam optimizer (β1 = 0.9, β2 = 0.999), with the initial learning rate 2×10⁻⁴ gradually reduced to 1×10⁻⁶ via cosine annealing. The training samples are augmented by horizontal flipping and rotation of 90°, 180°, or 270°.
Hardware Specification | Yes | We train ShuffleFormers using four TITAN Xp GPUs with batch size 8 on 256×256 image pairs. The training process lasts for 10 epochs.
Software Dependencies | No | The paper mentions using
Experiment Setup | Yes | The number of channels is set to 32 and the window size is 8×8. We train the network with the Adam optimizer (β1 = 0.9, β2 = 0.999), with the initial learning rate 2×10⁻⁴ gradually reduced to 1×10⁻⁶ via cosine annealing. The training samples are augmented by horizontal flipping and rotation of 90°, 180°, or 270°. The number of samples used in Monte-Carlo averaging is 16, i.e., M = 16. (A training-setup sketch follows the table.)
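As the Pseudocode row notes, the random shuffle and its inverse (the paper's Algorithms 1 and 2) are pure index rearrangements. The following is a minimal PyTorch sketch of that idea, not the authors' released code; for the exact implementation, see https://github.com/jiexiaou/ShuffleFormer.

import torch

def random_shuffle(x):
    # Randomly permute the spatial positions of a (B, C, H, W) tensor.
    # Returns the shuffled tensor and the permutation needed to undo it.
    b, c, h, w = x.shape
    perm = torch.randperm(h * w, device=x.device)   # random spatial permutation
    x_flat = x.flatten(2)                           # (B, C, H*W)
    shuffled = x_flat[:, :, perm].view(b, c, h, w)  # rearrange pixels only
    return shuffled, perm

def inverse_shuffle(x, perm):
    # Undo random_shuffle using the inverse permutation (argsort of perm).
    b, c, h, w = x.shape
    inv = torch.argsort(perm)
    return x.flatten(2)[:, :, inv].view(b, c, h, w)

# Round-trip check: inverse shuffle exactly restores the input.
x = torch.randn(1, 3, 8, 8)
y, perm = random_shuffle(x)
assert torch.equal(inverse_shuffle(y, perm), x)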
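The training recipe quoted in the Experiment Setup row (Adam with β1 = 0.9, β2 = 0.999, learning rate cosine-annealed from 2×10⁻⁴ to 1×10⁻⁶, and M = 16 Monte-Carlo samples at inference) maps onto standard PyTorch components. Below is a hedged sketch of that setup; the model stand-in and the total iteration count are placeholders, not values from the paper.

import torch

# Placeholder stand-in for the four-level encoder-decoder ShuffleFormer.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)

# Adam optimizer with the quoted hyperparameters.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))

# Cosine annealing from 2e-4 down to 1e-6 over the run
# (num_iters is an assumed placeholder for the total iteration count).
num_iters = 100_000
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_iters, eta_min=1e-6)

# Monte-Carlo averaging at inference: average M = 16 forward passes,
# each of which would use an independent random shuffle (omitted here).
@torch.no_grad()
def mc_average(model, x, m=16):
    return torch.stack([model(x) for _ in range(m)]).mean(dim=0)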