Rethinking Multi-Scale Representations in Deep Deraining Transformer

Authors: Hongming Chen, Xiang Chen, Jiyang Lu, Yufeng Li

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our model achieves consistent gains on five benchmarks. This paper makes the following contributions to the field: We rethink the multi-scale representations for the single image deraining problem, and propose an effective end-to-end multi-input multi-output architecture to better facilitate rain removal in the richer scale space. We show that coupled representation modules can jointly learn the intra-scale content-aware features and gated fusion modules can be beneficial for the inter-scale spatial-aware features, in order to help hierarchical modulation. We perform comprehensive experiments to demonstrate the effectiveness of our method against the state-of-the-art Transformer-based image deraining approaches.
Researcher Affiliation | Academia | Hongming Chen¹, Xiang Chen², Jiyang Lu¹, Yufeng Li¹* (¹College of Electronic Information Engineering, Shenyang Aerospace University; ²School of Computer Science and Engineering, Nanjing University of Science and Technology) {chenhongming,lujiyang1}@stu.sau.edu.cn, chenxiang@njust.edu.cn, liyufeng@sau.edu.cn
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The training code and test model will be available to the public.
Open Datasets | Yes | We evaluate the performance of our model on five publicly available rain streak datasets: Rain200L (Yang et al. 2017), Rain200H (Yang et al. 2017), DID-Data (Zhang and Patel 2018), DDN-Data (Fu et al. 2017), and SPA-Data (Wang et al. 2019).
Dataset Splits | Yes | Rain200L and Rain200H comprise 1,800 synthetic rainy images for training, along with 200 images designated for testing. DID-Data and DDN-Data comprise 12,000 and 12,600 synthetic images, featuring distinct rain directions and density levels; they include 1,200 and 1,400 rainy images, respectively, for testing. In addition, SPA-Data is a large-scale real-world rain benchmark, encompassing 638,492 image pairs for training, alongside 1,000 image pairs designated for testing. (These splits are condensed into the mapping after this table.)
Hardware Specification | Yes | We run all of our experiments with a batch size of 2 and a patch size of 256 on one NVIDIA GeForce RTX 4090 GPU (24 GB).
Software Dependencies | No | The paper states that the network is "implemented in the PyTorch framework" "using the Adam optimizer", but does not specify version numbers for these software components.
Experiment Setup | Yes | During training, the proposed network is implemented in the PyTorch framework using the Adam optimizer with a learning rate of 2 × 10⁻⁴ to minimize L_total. The final learning rate is steadily decreased to 1 × 10⁻⁴ using the cosine annealing strategy (Loshchilov and Hutter 2016). For Rain200L, Rain200H, DID-Data and DDN-Data, 500 epochs are trained, while SPA-Data is trained for 5 epochs. For data augmentation, we also randomly adopt horizontal and vertical flips. In our model, we adopt a stack of 8 CRMs (i.e., N = 8 in Figure 3). We set the thresholds of SCTB in S1, S2, and S3 to 0.6, 0.7, and 0.8, respectively. The setting of GDFN in SCTB is consistent with (Zamir et al. 2022). We run all of our experiments with batch size of 2 and patch size of 256... (see the training sketch after this table)
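
For quick reference, the benchmark splits above condense into a small Python mapping. The numbers are taken verbatim from the Dataset Splits row; the `RAIN_BENCHMARKS` name is ours, not from the paper:

```python
# (train_pairs, test_pairs) per benchmark, as reported in the paper.
RAIN_BENCHMARKS = {
    "Rain200L": (1_800, 200),
    "Rain200H": (1_800, 200),
    "DID-Data": (12_000, 1_200),
    "DDN-Data": (12_600, 1_400),
    "SPA-Data": (638_492, 1_000),  # large-scale real-world benchmark
}
```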
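
To make the reported setup concrete, below is a minimal PyTorch training-loop sketch assuming the configuration quoted above: Adam at 2 × 10⁻⁴ annealed to 1 × 10⁻⁴ via cosine annealing, batch size 2, 256×256 patches, random horizontal/vertical flips, and 500 epochs. Since the authors' code is not yet released, `DerainNet`, `PairedRainDataset`, and the `nn.L1Loss` stand-in for the paper's L_total are hypothetical placeholders, not the actual implementation.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

# --- Hypothetical stand-ins (the paper's code is not yet public) ----------
class DerainNet(nn.Module):
    """Placeholder for the paper's multi-input multi-output network
    (reported: a stack of N = 8 CRMs; SCTB thresholds 0.6 / 0.7 / 0.8)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # toy stand-in

    def forward(self, x):
        return self.body(x)

class PairedRainDataset(Dataset):
    """Placeholder paired dataset yielding (rainy, clean) 256x256 patches
    with the paper's random horizontal/vertical flip augmentation."""
    def __len__(self):
        return 1_800  # e.g., the Rain200L/Rain200H training size

    def __getitem__(self, idx):
        rainy = torch.rand(3, 256, 256)  # stand-in for a cropped rainy patch
        clean = torch.rand(3, 256, 256)  # stand-in for its ground truth
        if torch.rand(()) < 0.5:         # random horizontal flip
            rainy, clean = rainy.flip(-1), clean.flip(-1)
        if torch.rand(()) < 0.5:         # random vertical flip
            rainy, clean = rainy.flip(-2), clean.flip(-2)
        return rainy, clean

# --- Reported optimization setup ------------------------------------------
device = "cuda" if torch.cuda.is_available() else "cpu"
model = DerainNet().to(device)
loader = DataLoader(PairedRainDataset(), batch_size=2, shuffle=True)

epochs = 500  # 500 for the synthetic benchmarks; 5 for SPA-Data
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
# Cosine annealing from 2e-4 down to a final 1e-4, as reported.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=epochs, eta_min=1e-4)
criterion = nn.L1Loss()  # assumption: the paper's L_total is not given here

for epoch in range(epochs):
    for rainy, clean in loader:
        rainy, clean = rainy.to(device), clean.to(device)
        optimizer.zero_grad()
        loss = criterion(model(rainy), clean)
        loss.backward()
        optimizer.step()
    scheduler.step()  # one annealing step per epoch
```

With `T_max=epochs`, the learning rate reaches the reported final value of 1 × 10⁻⁴ exactly at the last epoch; for SPA-Data the same schedule would run with `epochs = 5`.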