Hundred-Kilobyte Lookup Tables for Efficient Single-Image Super-Resolution
Authors: Binxiao Huang, Jason Chun Lok Li, Jie Ran, Boyu Li, Jiajun Zhou, Dahai Yu, Ngai Wong
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train the proposed HKLUT on the DIV2K dataset [Agustsson and Timofte, 2017], a popular dataset in the SR field. We use the widely adopted Peak Signal-to-Noise Ratio (PSNR) and structural similarity index (SSIM) [Wang et al., 2004] as evaluation metrics. Five well-known datasets: Set5 [Bevilacqua et al., 2012], Set14 [Zeyde et al., 2012], BSDS100 [Martin et al., 2001], Urban100 [Huang et al., 2015] and Manga109 [Matsui et al., 2017] are benchmarked. |
| Researcher Affiliation | Collaboration | Binxiao Huang¹, Jason Chun Lok Li¹, Jie Ran¹, Boyu Li¹, Jiajun Zhou¹, Dahai Yu², Ngai Wong¹ — ¹The University of Hong Kong, ²TCL Corporate Research |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation is publicly available at: https://github.com/jasonli0707/hklut. |
| Open Datasets | Yes | We train the proposed HKLUT on the DIV2K dataset [Agustsson and Timofte, 2017], a popular dataset in the SR field. The DIV2K dataset provides 800 training images and 100 validation images with 2K resolution and the corresponding downsampled images. We use the widely adopted Peak Signal-to-Noise Ratio (PSNR) and structural similarity index (SSIM) [Wang et al., 2004] as evaluation metrics. Five well-known datasets: Set5 [Bevilacqua et al., 2012], Set14 [Zeyde et al., 2012], BSDS100 [Martin et al., 2001], Urban100 [Huang et al., 2015] and Manga109 [Matsui et al., 2017] are benchmarked. |
| Dataset Splits | Yes | The DIV2K dataset provides 800 training images and 100 validation images with 2K resolution and the corresponding downsampled images. |
| Hardware Specification | Yes | Before converting into LUTs, the network is trained with 800 training images in DIV2K for 200k iterations with a batch size of 16 on Nvidia RTX 3090 GPUs. We utilize the Adam optimizer [Kingma and Ba, 2014] (β1 = 0.9, β2 = 0.999 and ε = 1e-8) with the MSE loss to train the HKLUT. The initial learning rate is set to 5×10⁻⁴, which decays to one-tenth after 100k and 150k iterations, respectively. We randomly crop LR images into 48×48 patches as input and enhance the dataset by random rotation and flipping. The runtime is measured on an Intel Core i5-10505 CPU with 16GB RAM, averaged over 10 runs. Using the same settings as for calculating energy and peak memory, we compared the runtimes of FSRCNN and LUT-based approaches on both a desktop CPU and the Raspberry Pi 4 Model B. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer', 'Python code', 'memray', and 'CUDA' but does not provide specific version numbers for these software dependencies or libraries. |
| Experiment Setup | Yes | Before converting into LUTs, the network is trained with 800 training images in DIV2K for 200k iterations with a batch size of 16 on Nvidia RTX 3090 GPUs. We utilize the Adam optimizer [Kingma and Ba, 2014] (β1 = 0.9, β2 = 0.999 and ε = 1e-8) with the MSE loss to train the HKLUT. The initial learning rate is set to 5×10⁻⁴, which decays to one-tenth after 100k and 150k iterations, respectively. We randomly crop LR images into 48×48 patches as input and enhance the dataset by random rotation and flipping. |
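The training hyper-parameters reported above (Adam with β1 = 0.9, β2 = 0.999, ε = 1e-8; initial learning rate 5×10⁻⁴ decaying to one-tenth at 100k and 150k iterations; 48×48 random crops with rotation and flip augmentation) can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' code: `model` is a placeholder module, and the `augment` helper is a hypothetical name; the actual HKLUT implementation lives in the repository linked above.

```python
import torch

# Placeholder model standing in for the HKLUT network (assumption, not the
# authors' architecture).
model = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)

# Optimizer settings quoted in the paper: Adam with beta1=0.9, beta2=0.999,
# eps=1e-8, initial learning rate 5e-4, trained with MSE loss.
optimizer = torch.optim.Adam(
    model.parameters(), lr=5e-4, betas=(0.9, 0.999), eps=1e-8
)
criterion = torch.nn.MSELoss()

# Learning rate decays to one-tenth after 100k and 150k iterations.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100_000, 150_000], gamma=0.1
)

def augment(patch: torch.Tensor) -> torch.Tensor:
    """Random rotation and horizontal flip of a (C, 48, 48) crop
    (hypothetical helper mirroring the described augmentation)."""
    k = int(torch.randint(0, 4, (1,)))            # rotate 0/90/180/270 deg
    patch = torch.rot90(patch, k, dims=(-2, -1))
    if torch.rand(1).item() < 0.5:                # random horizontal flip
        patch = torch.flip(patch, dims=(-1,))
    return patch
```

A full training loop would draw 48×48 low-resolution crops from the 800 DIV2K training images for 200k iterations at batch size 16, stepping `scheduler` once per iteration.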