FreqFormer: Frequency-aware Transformer for Lightweight Image Super-resolution

Authors: Tao Dai, Jianping Wang, Hang Guo, Jinmin Li, Jinbao Wang, Zexuan Zhu

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on public datasets demonstrate the superiority of our FreqFormer over state-of-the-art SR methods in terms of both quantitative metrics and visual quality.
Researcher Affiliation | Academia | Tao Dai1,2, Jianping Wang1, Hang Guo3, Jinmin Li3, Jinbao Wang1,2, Zexuan Zhu1,2; 1College of Computer Science and Software Engineering, Shenzhen University; 2National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University; 3Tsinghua Shenzhen International Graduate School, Tsinghua University
Pseudocode | No | The paper includes architectural diagrams (Figure 2) but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and models are available at https://github.com/JPWang-CS/FreqFormer.
Open Datasets | Yes | Two training datasets, DIV2K [Lim et al., 2017] and Flickr2K [Radu Timofte and Zhang, 2017], were used for model training.
Dataset Splits | No | The paper describes training datasets (DIV2K, Flickr2K) and benchmark testing datasets (Set5, Set14, BSD100, Urban100, Manga109), but does not explicitly provide details about a separate validation dataset split.
Hardware Specification | Yes | Additionally, the model was trained using the PyTorch toolkit on 4 NVIDIA 3090 GPUs.
Software Dependencies | No | The paper mentions “PyTorch toolkit” but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | In our training setup, the model was configured with a patch size of 64×64 and a batch size of 32. The training process comprised 500,000 iterations, with an initial learning rate of 2e-4. The learning rate was halved at specific milestones: [250K, 400K, 450K, 475K]. Data augmentation techniques, including random horizontal flipping and rotations of 90°, 180°, and 270°, were applied to the training set. For optimization, the Adam optimizer was employed with β1 = 0.9 and β2 = 0.99 to minimize the L1 loss.
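
A minimal PyTorch sketch of the reported training configuration, not the authors' released code: `TinySR` is a stand-in placeholder for the real FreqFormer model, and the random tensors stand in for augmented DIV2K/Flickr2K patch batches; only the hyperparameters (batch size, patch size, iterations, learning-rate milestones, Adam betas, L1 loss) are taken from the row above.

```python
import torch
import torch.nn as nn

class TinySR(nn.Module):
    """Placeholder x2 SR model (NOT FreqFormer), used only to make the sketch runnable."""
    def __init__(self, scale=2):
        super().__init__()
        self.body = nn.Conv2d(3, 3 * scale * scale, 3, padding=1)
        self.up = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.up(self.body(x))

model = TinySR()
criterion = nn.L1Loss()                                   # L1 loss, as reported
optimizer = torch.optim.Adam(model.parameters(),
                             lr=2e-4,                     # initial learning rate
                             betas=(0.9, 0.99))           # beta1 = 0.9, beta2 = 0.99
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones=[250_000, 400_000, 450_000, 475_000],      # halve the LR at these iterations
    gamma=0.5)

batch_size, patch = 32, 64                                # 64x64 patches, batch size 32
for it in range(500_000):                                 # 500K training iterations
    lr_patch = torch.rand(batch_size, 3, patch, patch)            # stand-in for flipped/rotated LR crops
    hr_patch = torch.rand(batch_size, 3, patch * 2, patch * 2)    # matching HR targets
    loss = criterion(model(lr_patch), hr_patch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                                      # per-iteration schedule to match the milestones
```

Note that `scheduler.step()` is called once per iteration because the milestones are given in iterations rather than epochs.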