SAVSR: Arbitrary-Scale Video Super-Resolution via a Learned Scale-Adaptive Network

Authors: Zekun Li, Hongying Liu, Fanhua Shang, Yuanyuan Liu, Liang Wan, Wei Feng

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments at various scales on the benchmark datasets show that the proposed SAVSR outperforms state-of-the-art (SOTA) methods at non-integer and asymmetric scales.
Researcher Affiliation | Academia | (1) School of Artificial Intelligence, Xidian University, China; (2) Medical College, Tianjin University, Tianjin, China; (3) College of Intelligence and Computing, Tianjin University, Tianjin, China; (4) Peng Cheng Lab, Shenzhen, China
Pseudocode | No | The paper describes the network architecture and modules in text and diagrams (e.g., Figure 2), but it does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at https://github.com/Weepingchestnut/SAVSR.
Open Datasets | Yes | We use the training set from Vimeo-90K (Xue et al. 2019) dataset which contains over 9000 training and testing video sequences.
Dataset Splits | No | The paper states it uses the training set of Vimeo-90K for training and Vid4 and UDM10 for evaluation, but it does not explicitly provide details about a validation split used for hyperparameter tuning or early stopping during training, especially not from the Vimeo-90K dataset itself.
Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | Following the arbitrary-scale SISR works (Hu et al. 2019; Wang et al. 2021; Chen, Liu, and Wang 2021), for some non-integer scales we crop the frame borders to ensure that downsampling these scales does not result in fractional resolution, e.g., a 576 × 704 frame must be cropped to 575 × 700 when downsampling by 2.5, to make the LR frame with an integer resolution 230 × 280. Considering the trade-off between performance and complexity, we set the iteration window size to 7 and the sliding window size to 3 for the Vimeo90K dataset.
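
The border-cropping rule quoted in the Experiment Setup entry can be illustrated with a short sketch. This is not the authors' code: the function name `crop_for_scale` is hypothetical, and the sketch assumes the crop keeps the largest frame dimension whose downsampled size is an integer. Under that assumption it reproduces the 576 × 704 example from the text.

```python
from fractions import Fraction


def crop_for_scale(hr_size: int, scale: float) -> tuple[int, int]:
    """Return (cropped_hr_size, lr_size) for one frame dimension.

    Hypothetical helper, not taken from the SAVSR repository. It picks the
    largest HR size <= hr_size for which both the HR size and the
    downsampled LR size (HR / scale) are integers.
    """
    # Parse the scale exactly, e.g. 2.5 -> 5/2, to avoid float rounding.
    s = Fraction(str(scale))
    # Largest LR size whose upscaled size fits inside the original frame.
    lr = hr_size // s
    # lr * (num / den) is an integer only if lr is a multiple of den
    # (num and den are coprime in a reduced fraction).
    lr -= lr % s.denominator
    hr = int(lr * s)
    return hr, lr


# Example from the text: a 576 x 704 frame downsampled by 2.5.
height = crop_for_scale(576, 2.5)
width = crop_for_scale(704, 2.5)
print(height, width)
```

Using exact fractions rather than floating-point division keeps the check reliable for scales such as 1.1 or 3.3, which have no exact binary representation.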