Cross Aggregation Transformer for Image Restoration
Authors: Zheng Chen, Yulun Zhang, Jinjin Gu, Yongbing Zhang, Linghe Kong, Xin Yuan
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our CAT outperforms recent state-of-the-art methods on several image restoration applications. The code and models are available at https://github.com/zhengchen1999/CAT. |
| Researcher Affiliation | Academia | 1Shanghai Jiao Tong University, 2ETH Zürich, 3Shanghai AI Laboratory, 4The University of Sydney, 5Harbin Institute of Technology (Shenzhen), 6Westlake University |
| Pseudocode | No | The paper describes its methods and architecture using text and diagrams but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and models are available at https://github.com/zhengchen1999/CAT. |
| Open Datasets | Yes | For image SR, we choose DIV2K [37] and Flickr2K [22] as the training data. For JPEG compression artifact reduction, the training set consists of DIV2K [37], Flickr2K [22], BSD500 [2], and WED [26]. For real image denoising, we train CAT on the SIDD [1] dataset. |
| Dataset Splits | Yes | For real image denoising, we train CAT on SIDD [1] dataset. And we have two testing datasets: SIDD validation set [1] and DND [33]. |
| Hardware Specification | Yes | We use PyTorch [32] to implement our models with 4 Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using 'PyTorch [32]' but does not specify its version or other software dependencies with version numbers. |
| Experiment Setup | Yes | We set the residual group (RG) number as N1=6 and the cross aggregation Transformer block (CATB) number as N2=6 for each RG. The channel dimension, attention head number, and MLP expansion ratio for each CATB are set as 180, 6, and 4, respectively. For image SR, we train the model with batch size 32, where each input image is randomly cropped to 64×64 size, and the total training iterations are 500K. We adopt the Adam optimizer [18] with β1=0.9 and β2=0.99 to minimize the L1 loss... The initial learning rate is set as 2×10^-4 and reduced by half at the milestones [250K, 400K, 450K, 475K]. |
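
The training configuration in the Experiment Setup row maps directly onto standard PyTorch building blocks. Below is a minimal sketch of the optimizer and learning-rate schedule described there (Adam with β1=0.9, β2=0.99, L1 loss, initial learning rate 2×10^-4, halved at iterations 250K, 400K, 450K, and 475K over 500K total iterations, batch size 32, 64×64 crops). The `model` placeholder and the random tensors are assumptions for illustration only; the actual CAT architecture (6 residual groups of 6 CATBs, 180 channels, 6 heads, MLP ratio 4) is available in the authors' repository.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the CAT model; the real architecture is in
# https://github.com/zhengchen1999/CAT and is not reproduced here.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)

# Adam with beta1=0.9, beta2=0.99 minimizing the L1 loss, as stated in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.99))
criterion = nn.L1Loss()

# Learning rate halved at the milestone iterations [250K, 400K, 450K, 475K].
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[250_000, 400_000, 450_000, 475_000], gamma=0.5)

for iteration in range(500_000):  # 500K total training iterations
    # Placeholder batch: batch size 32, 3-channel 64x64 crops (random data here,
    # DIV2K/Flickr2K patches in the actual setup).
    lr_patch = torch.rand(32, 3, 64, 64)
    hr_patch = torch.rand(32, 3, 64, 64)

    optimizer.zero_grad()
    loss = criterion(model(lr_patch), hr_patch)
    loss.backward()
    optimizer.step()
    scheduler.step()  # schedule is stepped per iteration, not per epoch
```

Note that the scheduler is stepped once per iteration because the paper specifies its milestones in iterations rather than epochs.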