Implicit Transformer Network for Screen Content Image Continuous Super-Resolution
Authors: Jingyu Yang, Sheng Shen, Huanjing Yue, Kun Li
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed ITSRN significantly outperforms several competitive continuous and discrete SR methods for both compressed and uncompressed SCIs. |
| Researcher Affiliation | Academia | ¹School of Electrical and Information Engineering, Tianjin University; ²College of Intelligence and Computing, Tianjin University. {yjy, codyshen, huanjing.yue, lik}@tju.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | https://github.com/codyshen0000/ITSRN |
| Open Datasets | Yes | To the best of our knowledge, there is still no SCI SR dataset for public usage. Therefore, we first build a dataset named SCI1K. It contains 1000 screenshots with various screen contents, including but not limited to web pages, game scenes, cartoons, slides, documents, etc. ... To evaluate the generalization of the trained model, besides our test set, we also test on two other screen content datasets constructed for image quality assessment, i.e., SCID (including 40 images with resolution 1280×720) [12] and SIQAD (including 20 images with resolution around 600×800) [13]. |
| Dataset Splits | Yes | Among them, 800 images with resolution 1280×720 are used for training and validation. The other 200 images with resolution ranging from 1280×720 to 2560×1440 are used for testing. |
| Hardware Specification | Yes | We parallelly run our ITSRN-RDN on two GeForce GTX 1080Ti GPUs with mini-batch size 16 and it costs 2 days to reach convergence (about 500 epochs). |
| Software Dependencies | No | The Adam [34] optimizer is used with beta1=0.9 and beta2=0.999. All the parameters are initialized with He initialization and the whole network is trained end-to-end. Following [6], the learning rate starts with 1e-4 for all modules and decays in half every 200 epochs. |
| Experiment Setup | Yes | In the training phase, to simulate continuous magnification, the downsampling scale is sampled from a uniform distribution U(1, 4). We then randomly crop 48×48 patches from the LR images and augment them via flipping and rotation. Following [18], we utilize the ℓ1 distance between the reconstructed image and the ground truth as the loss function. The Adam [34] optimizer is used with beta1=0.9 and beta2=0.999. All the parameters are initialized with He initialization and the whole network is trained end-to-end. Following [6], the learning rate starts with 1e-4 for all modules and decays in half every 200 epochs. |
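
The training recipe quoted in the Experiment Setup and Software Dependencies rows maps naturally onto a short configuration sketch. The snippet below is a minimal PyTorch sketch, not the authors' released ITSRN code (see the GitHub link above): the stand-in `model`, the `random_lr_patch` and `paired_augment` helpers, and the coordinate rounding are assumptions for illustration. Only the scale distribution U(1, 4), the 48×48 LR crops, flip/rotation augmentation, the ℓ1 loss, Adam with beta1=0.9 and beta2=0.999, He initialization, and the halve-every-200-epochs learning-rate schedule come from the quoted text.

```python
# Minimal PyTorch sketch of the quoted training setup; the model and the
# data helpers below are placeholders, not the authors' ITSRN pipeline.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


def random_lr_patch(hr, patch=48):
    """Simulate continuous magnification: bicubic-downsample an HR image
    (C, H, W) by a random scale s ~ U(1, 4), then crop a 48x48 LR patch
    and the approximately matching HR region. Assumes hr is large enough."""
    s = random.uniform(1.0, 4.0)
    lr = F.interpolate(hr[None], scale_factor=1.0 / s,
                       mode="bicubic", align_corners=False)[0]
    _, h, w = lr.shape
    top, left = random.randint(0, h - patch), random.randint(0, w - patch)
    lr_patch = lr[:, top:top + patch, left:left + patch]
    hr_patch = hr[:, round(top * s):round((top + patch) * s),
                  round(left * s):round((left + patch) * s)]
    return lr_patch, hr_patch, s


def paired_augment(lr_patch, hr_patch):
    """Apply the same random flips and 90-degree rotation to both patches."""
    if random.random() < 0.5:
        lr_patch, hr_patch = lr_patch.flip((-1,)), hr_patch.flip((-1,))
    if random.random() < 0.5:
        lr_patch, hr_patch = lr_patch.flip((-2,)), hr_patch.flip((-2,))
    k = random.randint(0, 3)
    return lr_patch.rot90(k, (-2, -1)), hr_patch.rot90(k, (-2, -1))


def he_init(m):
    """He (Kaiming) initialization for convolutional layers."""
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)


model = nn.Conv2d(3, 3, 3, padding=1)   # stand-in for the ITSRN network
model.apply(he_init)

criterion = nn.L1Loss()                 # l1 reconstruction loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
# Halve the learning rate every 200 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
```

In the actual implicit-function pipeline the network is queried at continuous HR pixel coordinates rather than producing a fixed-size output tensor, so the forward pass is omitted here; this sketch only pins down the data preparation and optimization hyperparameters reported in the table.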