Implicit Transformer Network for Screen Content Image Continuous Super-Resolution
Authors: Jingyu Yang, Sheng Shen, Huanjing Yue, Kun Li
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the proposed ITSRN significantly outperforms several competitive continuous and discrete SR methods for both compressed and uncompressed SCIs. |
| Researcher Affiliation | Academia | ¹School of Electrical and Information Engineering, Tianjin University; ²College of Intelligence and Computing, Tianjin University. {yjy, codyshen, huanjing.yue, lik}@tju.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | https://github.com/codyshen0000/ITSRN |
| Open Datasets | Yes | To the best of our knowledge, there is still no SCI SR dataset for public usage. Therefore, we first build a dataset named SCI1K. It contains 1000 screenshots with various screen contents, including but not limited to web pages, game scenes, cartoons, slides, documents, etc. ... To evaluate the generalization of the trained model, besides our test set, we also test on two other screen content datasets constructed for image quality assessment, i.e., SCID (including 40 images with resolution 1280×720) [12] and SIQAD (including 20 images with resolution around 600×800) [13]. |
| Dataset Splits | Yes | Among them, 800 images with resolution 1280×720 are used for training and validation. The other 200 images with resolution ranging from 1280×720 to 2560×1440 are used for testing. |
| Hardware Specification | Yes | We parallelly run our ITSRN-RDN on two GeForce GTX 1080Ti GPUs with mini-batch size 16 and it costs 2 days to reach convergence (about 500 epochs). |
| Software Dependencies | No | The Adam [34] optimizer is used with beta1=0.9 and beta2=0.999. All the parameters are initialized with He initialization and the whole network is trained end-to-end. Following [6], the learning rate starts with 1e-4 for all modules and decays in half every 200 epochs. |
| Experiment Setup | Yes | In the training phase, to simulate continuous magnification, the downsampling scale is sampled from a uniform distribution U(1, 4). We then randomly crop 48×48 patches from the LR images and augment them via flipping and rotation. Following [18], we utilize the ℓ1 distance between the reconstructed image and the ground truth as the loss function. The Adam [34] optimizer is used with beta1=0.9 and beta2=0.999. All the parameters are initialized with He initialization and the whole network is trained end-to-end. Following [6], the learning rate starts with 1e-4 for all modules and decays in half every 200 epochs. |
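
The training recipe quoted in the Experiment Setup and Software Dependencies rows maps naturally onto a short configuration sketch. The snippet below is a minimal PyTorch sketch, not the authors' released ITSRN code (see the GitHub link above): the stand-in `model`, the `random_lr_patch` and `paired_augment` helpers, and the coordinate rounding are assumptions for illustration. Only the scale distribution U(1, 4), the 48×48 LR crops, flip/rotation augmentation, the ℓ1 loss, Adam with beta1=0.9 and beta2=0.999, He initialization, and the halve-every-200-epochs learning-rate schedule come from the quoted text.

```python
# Minimal PyTorch sketch of the quoted training setup; the model and the
# data helpers below are placeholders, not the authors' ITSRN pipeline.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


def random_lr_patch(hr, patch=48):
    """Simulate continuous magnification: bicubic-downsample an HR image
    (C, H, W) by a random scale s ~ U(1, 4), then crop a 48x48 LR patch
    and the approximately matching HR region. Assumes hr is large enough."""
    s = random.uniform(1.0, 4.0)
    lr = F.interpolate(hr[None], scale_factor=1.0 / s,
                       mode="bicubic", align_corners=False)[0]
    _, h, w = lr.shape
    top, left = random.randint(0, h - patch), random.randint(0, w - patch)
    lr_patch = lr[:, top:top + patch, left:left + patch]
    hr_patch = hr[:, round(top * s):round((top + patch) * s),
                  round(left * s):round((left + patch) * s)]
    return lr_patch, hr_patch, s


def paired_augment(lr_patch, hr_patch):
    """Apply the same random flips and 90-degree rotation to both patches."""
    if random.random() < 0.5:
        lr_patch, hr_patch = lr_patch.flip((-1,)), hr_patch.flip((-1,))
    if random.random() < 0.5:
        lr_patch, hr_patch = lr_patch.flip((-2,)), hr_patch.flip((-2,))
    k = random.randint(0, 3)
    return lr_patch.rot90(k, (-2, -1)), hr_patch.rot90(k, (-2, -1))


def he_init(m):
    """He (Kaiming) initialization for convolutional layers."""
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)


model = nn.Conv2d(3, 3, 3, padding=1)   # stand-in for the ITSRN network
model.apply(he_init)

criterion = nn.L1Loss()                 # l1 reconstruction loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
# Halve the learning rate every 200 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
```

In the actual implicit-function pipeline the network is queried at continuous HR pixel coordinates rather than producing a fixed-size output tensor, so the forward pass is omitted here; this sketch only pins down the data preparation and optimization hyperparameters reported in the table.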