Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-Frame Deformable Look-Up Table for Compressed Video Quality Enhancement

Authors: Gang He, Guancheng Quan, Chang Wu, Shihao Wang, Dajiang Zhou, Yunsong Li

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments demonstrate that our proposed method dramatically outperforms the state-of-the-art LUT-based methods, and obtains competitive performance compared to CNN-based methods with the capability to run in real time (30 fps) at 1080p resolution. We quantitatively and qualitatively evaluate the performance of our proposed model on 18 HEVC Test Sequences, which outperforms existing LUT-based models and rivals multi-frame CNN models with less computational cost.
Researcher Affiliation Collaboration Gang He (1), Guancheng Quan (1,*), Chang Wu (1), Shihao Wang (2), Dajiang Zhou (2), Yunsong Li (1). (1) Xidian University, Shaanxi, China; (2) Ant Group, Zhejiang, China
Pseudocode No The paper describes the proposed method using figures, mathematical equations, and textual descriptions of modules (Motion Feature Modulation Network, Temporal Feature Extraction Module, Multi-Scale Fusion Module) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code No The paper discusses implementing the method in PyTorch and using DCNv2, but it does not contain an explicit statement about releasing its own source code, nor does it provide a link to a code repository.
Open Datasets Yes For model training, we selected CVCP (Chen et al. 2021) dataset. For model testing, 18 standard test sequences (Ohm et al. 2012) with 150 to 600 frames per video from JCT-VC are utilized for evaluation.
Dataset Splits No The paper specifies using the CVCP dataset for training and 18 standard test sequences from JCT-VC for evaluation. It mentions "random crop of 256×256 to the input and set the batch size to 8" as a training strategy. However, it does not explicitly provide specific percentages, sample counts, or predefined splits for training, validation, or testing, only that different datasets are used for training and testing.
Hardware Specification Yes All experiments are conducted in PyTorch 2.1 with V100 GPUs.
Software Dependencies Yes We constructed our method on PyTorch (Paszke et al. 2019) and the deformable convolution is based on DCNv2 (Zhu et al. 2019). All experiments are conducted in PyTorch 2.1 with V100 GPUs.
Experiment Setup Yes We apply a random crop of 256×256 to the input and set the batch size to 8. We divide the training of the model into three stages: complete CNN training, sampling finetuning, and LUT-Convert finetuning. In each stage, we use Mean Square Error (MSE) loss to constrain the difference between the enhanced and original frames. ... Firstly, we utilize the Adam (Kingma and Ba 2014) optimiser for CNN training on the QP37 dataset, and set the initial learning rate as 10⁻⁴. At the end of training, we utilize the trained model weights to finetune on datasets with QP of 22, 27 and 32, respectively. ... In order to improve the stability of the converted LUTs, we adopt the LUT-aware Finetuning Strategy proposed by MuLUT. During the evaluation phase, we improve the efficient 4D LUT interpolation method (Yin et al. 2023) based on CUDA to obtain high efficiency.
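The 4D LUT interpolation mentioned in the Experiment Setup row can be illustrated with a toy quadrilinear lookup: a query point in [0,1]⁴ is mapped into the table's index space and the 2⁴ = 16 surrounding corner entries are blended by their multilinear weights. This is a hedged, pure-Python sketch of that general idea only — not the CUDA-accelerated method of Yin et al. (2023) or the paper's implementation; the function name and toy LUT structure are illustrative assumptions.

```python
# Illustrative sketch: quadrilinear interpolation on a 4D look-up table.
# `lut` is assumed to be a mapping from 4-tuples of integer grid indices
# (each in [0, size-1]) to float values; `x` is a query point in [0,1]^4.
from itertools import product

def lut_lookup_4d(lut, size, x):
    # Map each normalized coordinate into LUT index space.
    pos = [xi * (size - 1) for xi in x]
    # Lower corner of the enclosing 4D cell, clamped so the +1 corner exists.
    base = [min(int(p), size - 2) for p in pos]
    # Fractional offset of the query inside that cell, per dimension.
    frac = [p - b for p, b in zip(pos, base)]
    out = 0.0
    # Blend the 2^4 = 16 surrounding corners with multilinear weights.
    for corner in product((0, 1), repeat=4):
        w = 1.0
        for c, f in zip(corner, frac):
            w *= f if c else (1.0 - f)
        idx = tuple(b + c for b, c in zip(base, corner))
        out += w * lut[idx]
    return out
```

Because multilinear interpolation reproduces any function that is linear in each coordinate, a LUT filled with such a function is recovered exactly at arbitrary query points, which is a convenient sanity check for an implementation like this.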