Efficient Non-local Contrastive Attention for Image Super-resolution

Authors: Bin Xia, Yucheng Hang, Yapeng Tian, Wenming Yang, Qingmin Liao, Jie Zhou

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results show that ENLCN reaches superior performance over state-of-the-art approaches on both quantitative and qualitative evaluations. Relevant sections: Experiments; Datasets and Evaluation Metrics; Implementation Details; Comparisons with State-of-the-arts; Ablation Study.
Researcher Affiliation | Academia | 1. Shenzhen International Graduate School / Department of Electronic Engineering, Tsinghua University; 2. University of Rochester; 3. Department of Automation, Tsinghua University
Pseudocode | No | The paper describes its methods in text and diagrams but does not include explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an explicit statement about releasing its source code, nor does it provide a link to a code repository.
Open Datasets | Yes | Following EDSR (Lim et al. 2017) and RNAN (Zhang et al. 2019), we use DIV2K (Timofte et al. 2017), a dataset consisting of 800 training images, to train our models. We test our method on 5 standard SISR benchmarks: Set5 (Bevilacqua et al. 2012), Set14 (Zeyde, Elad, and Protter 2010), B100 (Martin et al. 2001), Urban100 (Huang, Singh, and Ahuja 2015b), and Manga109 (Matsui et al. 2017).
Dataset Splits | No | The paper states that DIV2K is used for training and specific benchmarks for testing, but it does not explicitly describe a validation split or how one was used for hyperparameter tuning.
Hardware Specification | Yes | The model is implemented with PyTorch and trained on Nvidia 2080ti GPUs.
Software Dependencies | No | The paper mentions PyTorch but does not specify a version number or any other software dependencies with their versions.
Experiment Setup | Yes | For ENLCA, we regenerate the Gaussian random matrix F every epoch. Additionally, the amplification factor k is 6, the margin b is 1, and the number of random samples m is set to 128. We build ENLCN on the EDSR backbone with 32 residual blocks and 5 additional ENLCA blocks. All convolutional kernels in the network are 3×3. All intermediate features have 256 channels, except the embedded features in the attention module, which have 64 channels. The last convolution layer has 3 filters to transform the feature map into a 3-channel RGB image. During training, we set n1 and n2 for contrastive learning to 2% and 8%, respectively. Besides, we randomly crop 28×28 and 46×46 patches from 16 images to form an input batch for ×4 and ×2 SR, respectively. We augment the training patches by random horizontal flipping and rotation by 90°, 180°, and 270°. The model is optimized by the Adam optimizer (Kingma and Ba 2014) with β1 = 0.9, β2 = 0.99, and an initial learning rate of 1e-4. We reduce the learning rate by 0.5 after 200 epochs and obtain the final model after 1000 epochs.
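The reported learning-rate policy (initial rate 1e-4, reduced by 0.5 after 200 epochs, training for 1000 epochs) can be sketched as a step-decay function. This is a minimal illustration, not the authors' code; the paper's phrasing "reduce the learning rate by 0.5 after 200 epochs" is read here as a recurring halving every 200 epochs, which is an assumption (a one-time halving at epoch 200 is the other plausible reading). The function name and signature are ours.

```python
def learning_rate(epoch, base_lr=1e-4, decay=0.5, step=200):
    """Step-decay schedule as reported for ENLCN training.

    Assumes the rate is multiplied by `decay` once per `step` epochs,
    i.e. lr(epoch) = base_lr * decay ** (epoch // step).
    """
    return base_lr * decay ** (epoch // step)

# Example values over the 1000-epoch training run:
# epoch 0   -> 1e-4
# epoch 200 -> 5e-5
# epoch 999 -> 6.25e-6 (after four halvings)
```

In PyTorch (the framework named in the paper) the same policy would correspond to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)` paired with `torch.optim.Adam(..., lr=1e-4, betas=(0.9, 0.99))`.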