Efficient Non-local Contrastive Attention for Image Super-resolution
Authors: Bin Xia, Yucheng Hang, Yapeng Tian, Wenming Yang, Qingmin Liao, Jie Zhou
AAAI 2022, pp. 2759-2767
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experimental results show that ENLCN reaches superior performance over state-of-the-art approaches on both quantitative and qualitative evaluations." Relevant sections: Experiments; Datasets and Evaluation Metrics; Implementation Details; Comparisons with State-of-the-arts; Ablation Study. |
| Researcher Affiliation | Academia | 1 Shenzhen International Graduate School / Department of Electronic Engineering, Tsinghua University 2 University of Rochester 3 Department of Automation, Tsinghua University |
| Pseudocode | No | The paper describes its methods in text and diagrams but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing its source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Following EDSR (Lim et al. 2017) and RNAN (Zhang et al. 2019), we use DIV2K (Timofte et al. 2017), a dataset consisting of 800 training images, to train our models. We test our method on 5 standard SISR benchmarks: Set5 (Bevilacqua et al. 2012), Set14 (Zeyde, Elad, and Protter 2010), B100 (Martin et al. 2001), Urban100 (Huang, Singh, and Ahuja 2015b) and Manga109 (Matsui et al. 2017). |
| Dataset Splits | No | The paper states using DIV2K for training and specific benchmarks for testing, but it does not explicitly provide details about a validation dataset split or how it was used for hyperparameter tuning. |
| Hardware Specification | Yes | The model is implemented with PyTorch and trained on Nvidia 2080ti GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | For ENLCA, we regenerate Gaussian random matrix F every epoch. Additionally, amplification factor k is 6, and margin b is 1. The number of random samples m is set to 128. We build ENLCN using the EDSR backbone with 32 residual blocks and 5 additional ENLCA blocks. All convolutional kernel sizes in the network are 3×3. All intermediate features have 256 channels except for the embedded features in the attention module, which have 64 channels. The last convolution layer has 3 filters to transform the feature map into a 3-channel RGB image. During training, we set n1 and n2 for contrastive learning to 2% and 8%, separately. Besides, we randomly crop 28×28 and 46×46 patches from 16 images to form an input batch for ×4 and ×2 SR, respectively. We augment the training patches by randomly horizontal flipping and rotating 90°, 180°, 270°. The model is optimized by the ADAM optimizer (Kingma and Ba 2014) with β1 = 0.9, β2 = 0.99 and initial learning rate of 1e-4. We reduce the learning rate by 0.5 after 200 epochs and obtain the final model after 1000 epochs. |
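The learning-rate schedule quoted above can be sketched as a small step-decay function. This is a minimal illustration, not the authors' code: it assumes the halving repeats every 200 epochs (an EDSR-style step schedule), whereas the text only states a reduction "after 200 epochs", so the repetition is an assumption.

```python
def step_lr(epoch, initial_lr=1e-4, decay=0.5, step=200):
    """Step-decay schedule matching the quoted setup: start at 1e-4
    and multiply by 0.5 every `step` epochs. The every-200-epoch
    repetition is an assumption inferred from EDSR-style training."""
    return initial_lr * decay ** (epoch // step)

# Spot-check the schedule over the 1000-epoch run described in the paper.
for e in (0, 199, 200, 400, 999):
    print(e, step_lr(e))
```

Under this reading, the rate falls from 1e-4 to 6.25e-6 by the final epoch; other hyperparameters in the row (k = 6, b = 1, m = 128) belong to the ENLCA attention module itself and are independent of this schedule.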