Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

QBasicVSR: Temporal Awareness Adaptation Quantization for Video Super-Resolution

Authors: Zhenwei Zhang, Fanhua Shang, Hongying Liu, Liang Wan, Wei Feng, Yanming Hui

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method achieves extraordinary performance with state-of-the-art efficient VSR approaches, delivering up to 200× faster processing speed while utilizing only 1/8 of the GPU resources. Additionally, extensive experiments demonstrate that the proposed method significantly outperforms existing PTQ algorithms on various datasets. For instance, it attains a 2.53 dB increase on the UDM10 benchmark when quantizing BasicVSR to 4-bit with 100 unlabeled video clips.
Researcher Affiliation | Academia | Zhenwei Zhang1, Fanhua Shang1, Hongying Liu2,3, Liang Wan2,4, Wei Feng1, Yanming Hui1; 1School of Computer Science and Technology, Tianjin University; 2Medical School, Tianjin University; 3Peng Cheng Lab, Shenzhen; 4School of Computer Software, Tianjin University; EMAIL
Pseudocode | Yes | Algorithm 1 Temporal Awareness Adaptation Quantization for Video Super-Resolution
Open Source Code | Yes | To promote collaborative studies and industrial deployments, we also provide QBasicVSR, a novel quantization VSR library. This new library includes the complete implementation, such as training protocols, quantization-aware modules, and pre-trained model weights. ... We have provided code to reproduce the results in this work.
Open Datasets | Yes | We build the calibration datasets by randomly sampling 100 LR video clips from the REDS [33] and Vimeo-90K [44]. For REDS, we use REDS4, containing four clips, as our test set. In addition, we utilize Vid4 [30], UDM10 [45], and Vimeo-90K-T [44] as test sets along with Vimeo-90K.
Dataset Splits | Yes | We build the calibration datasets by randomly sampling 100 LR video clips from the REDS [33] and Vimeo-90K [44]. For REDS, we use REDS4, containing four clips, as our test set. In addition, we utilize Vid4 [30], UDM10 [45], and Vimeo-90K-T [44] as test sets along with Vimeo-90K. ... REDS [33] is a standard proposed high-quality video dataset... It includes 270 clips that serve both training and validation purposes. Following the setting in [33], we use REDS4, which consists of four representative clips (000, 011, 015 and 020), as the evaluation set, while the remaining 266 clips are used for training. ... Vimeo-90K [44] is a widely used video dataset... It provides 64,612 clips for training and 7,824 clips for testing, with the test set referred to as Vimeo-90K-T.
Hardware Specification | Yes | As shown in Table 4, the SOTA efficient BasicVSR method [42] requires training 303,380 iterations with a batch size of 8, which takes approximately 15 days on 8 NVIDIA A6000 GPUs. ... The processing time is measured on A6000 GPUs. ... We present CPU evaluation across three benchmark datasets... running CPU-only inference on both Intel Xeon Gold 5218R (x86; Linux; 40 threads used) and Apple M1 Pro (ARM; macOS; 8 threads used).
Software Dependencies | No | The paper mentions the MATLAB imresize function and Adam optimizer [24], but does not provide specific version numbers for these or other key software components like programming languages, frameworks (e.g., PyTorch), or CUDA versions.
Experiment Setup | Yes | The parameters γ and λ in the flow-gradient complexity metric are set to 200 and 10. The hyperparameters for calibrating the bit adaptation modules, p_V, p_space, and p_temp, are set to 10, 30 and 30. The constant in the reconstruction loss is set to 10^-6. The parameter λ_skt in total loss is set to 0.1. After freezing the network weights, we sequentially fine-tune individual components for one epoch each: optimizing weight clipping ranges in the first epoch, activation clipping ranges in the second epoch, and bit adaptation module parameters in the third epoch. Each stage uses the Adam optimizer [24] with a batch size of 2. The initial learning rates are set to 1 × 10^-3 for weight clipping ranges, 1 × 10^-5 for activation clipping ranges, and 0.1 for both temporal shared layer bit adaptation and flow-gradient video bit adaptation modules. Each learning rate is decayed by 0.9 every epoch.
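
The three-stage calibration schedule quoted in the Experiment Setup row can be written down as a small configuration sketch. This is only an illustration of the quoted settings, not the authors' code: every name here (Stage, STAGES, lr_at_epoch, build_schedule) is hypothetical, the learning rates for the clipping ranges are taken as 1e-3 and 1e-5 (negative exponents assumed, as is usual for learning rates), and the 0.9 decay is assumed to be counted from the start of each stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One fine-tuning stage (hypothetical structure, for illustration only)."""
    name: str          # parameter group tuned while the network weights stay frozen
    initial_lr: float  # learning rate at the start of the stage


# Values quoted in the Experiment Setup row; the order follows the quoted text.
STAGES = [
    Stage("weight_clipping_ranges", 1e-3),      # first epoch
    Stage("activation_clipping_ranges", 1e-5),  # second epoch
    Stage("bit_adaptation_modules", 0.1),       # third epoch
]
BATCH_SIZE = 2         # quoted batch size for every stage
DECAY_PER_EPOCH = 0.9  # quoted multiplicative decay applied every epoch
EPOCHS_PER_STAGE = 1   # quoted: one epoch per component


def lr_at_epoch(initial_lr, epochs_completed, decay=DECAY_PER_EPOCH):
    """Learning rate after a number of completed epochs under exponential decay."""
    return initial_lr * decay ** epochs_completed


def build_schedule(stages=STAGES, epochs_per_stage=EPOCHS_PER_STAGE):
    """Return (global_epoch, stage_name, learning_rate) tuples in training order.

    Assumption: the decay counter restarts with each stage, so with one epoch
    per stage every stage runs at its initial learning rate.
    """
    schedule = []
    global_epoch = 0
    for stage in stages:
        for local_epoch in range(epochs_per_stage):
            lr = lr_at_epoch(stage.initial_lr, local_epoch)
            schedule.append((global_epoch, stage.name, lr))
            global_epoch += 1
    return schedule


if __name__ == "__main__":
    for epoch, name, lr in build_schedule():
        print(f"epoch {epoch}: tune {name} with Adam, lr={lr:g}, batch size {BATCH_SIZE}")
```

Under these assumptions the decay never takes effect inside a stage, because each stage lasts a single epoch; it would only matter if a stage were extended beyond one epoch.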