Offline and Online Optical Flow Enhancement for Deep Video Compression

Authors: Chuanbo Tang, Xihua Sheng, Zhuoyuan Li, Haotian Zhang, Li Li, Dong Liu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on two state-of-the-art deep video compression schemes, DCVC and DCVC-DC. Experimental results demonstrate that the proposed offline and online enhancement together achieves on average 13.4% bitrate saving for DCVC and 4.1% bitrate saving for DCVC-DC on the tested videos, without increasing the model or computational complexity of the decoder side.
Researcher Affiliation | Academia | University of Science and Technology of China; {cbtang,xhsheng,zhuoyuanli,zhanghaotian}@mail.ustc.edu.cn, {lil1,dongeliu}@ustc.edu.cn
Pseudocode | Yes | Algorithm 1: Optical Flow Latent Updating in the Inference Stage (a hedged sketch of this updating loop follows the table).
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it contain an explicit statement of code release or a link to a repository.
Open Datasets | Yes | We use BVI-DVC (Ma, Zhang, and Bull 2021) dataset for fine-tuning Spynet. ... The commonly-used Vimeo-90k (Xue et al. 2019) dataset is used for training DCVC and DCVC-DC in an end-to-end manner.
Dataset Splits | No | The paper mentions that 'all the videos of training sets are randomly cropped into 256 × 256 patches' for training and that 'We test 96 frames for each video', but it does not specify quantitative training/validation/test splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | Yes | The motion vectors are extracted by VTM-10.0. The Adam optimizer (Kingma and Ba 2014) is used. Cheng2020Anchor (Cheng et al. 2020), as implemented in CompressAI (Bégaint et al. 2020), is used (a CompressAI loading snippet follows the table).
Experiment Setup | Yes | In the first stage, we set λ_ME to 100 and fine-tune the Spynet using the extracted MVs for 1,000,000 iterations. In the second stage, we deploy the enhanced Spynet into the video codec and train the whole video compression network for 5,000,000 iterations until convergence. Finally, we set the updating times N in Algorithm 1 to 1500 according to the ablation study. The initial learning rate for the first two training stages is 1e-4, which is decreased to 5e-5 at the 800,000th and 4,000,000th iterations, respectively. The initial learning rate for online optimization is 5e-3, which is decreased by 50% at the 1200th iteration. The Adam optimizer (Kingma and Ba 2014) is used, and the batch size is set to 16 for the first training stage and 4 for the second training stage. (A sketch of this two-stage schedule also follows the table.)