Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Zero-shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model
Authors: Cong Cao, Huanjing Yue, Xin Liu, Jingyu Yang
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the superiority of our proposed method. ... We utilize six metrics to evaluate the restoration and enhancement quality. Besides the commonly used metrics PSNR, SSIM, and FID, we utilize Warping Error (WE) (Lai et al. 2018), Frame Similarity (FS) (Wu et al. 2023; Chen et al. 2023; Qi et al. 2023), and optical flow map error (OFME) (Wang et al. 2024; Chen et al. 2023) to evaluate temporal consistency. ... In this section, we perform an ablation study to demonstrate the effectiveness of the proposed SLR Temporal Attention, Temporal Consistency Guidance, Spatial-Temporal Noise Sharing, and Early Stopping Sampling Strategy. |
| Researcher Affiliation | Academia | 1School of Electrical and Information Engineering, Tianjin University, Tianjin, China 2Computer Vision and Pattern Recognition Laboratory, School of Engineering Science, Lappeenranta-Lahti University of Technology LUT, Lappeenranta, Finland (author email addresses redacted) |
| Pseudocode | No | The paper describes the methods through text and diagrams (Fig. 1, Fig. 2) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/cao-cong/ZVRD |
| Open Datasets | Yes | For video super-resolution, we collected 18 GT videos from commonly used test datasets REDS4 (Nah et al. 2019), Vid4 (Liu and Sun 2013), and UDM10 (Yi et al. 2019). For video deblurring, we collected 10 ground truth (GT) videos from the dataset REDS (Nah et al. 2019). For video denoising, we collected 15 GT videos from the commonly used test dataset Set8 (Tassano, Delon, and Veit 2020) and DAVIS (Pont-Tuset et al. 2017). For video inpainting, we collected 20 GT videos from the commonly used DAVIS (Pont-Tuset et al. 2017) dataset. For video colorization, we use the GT videos from the Videvo20 (Lai et al. 2018) dataset... For low-light video enhancement, we collected 10 paired low-normal videos from the DID dataset (Fu et al. 2023). |
| Dataset Splits | Yes | For video super-resolution, we collected 18 GT videos from commonly used test datasets REDS4 (Nah et al. 2019), Vid4 (Liu and Sun 2013), and UDM10 (Yi et al. 2019). For video deblurring, we collected 10 ground truth (GT) videos from the dataset REDS (Nah et al. 2019). For video denoising, we collected 15 GT videos from the commonly used test dataset Set8 (Tassano, Delon, and Veit 2020) and DAVIS (Pont-Tuset et al. 2017). For video inpainting, we collected 20 GT videos from the commonly used DAVIS (Pont-Tuset et al. 2017) dataset. For video colorization, we use the GT videos from the Videvo20 (Lai et al. 2018) dataset... For low-light video enhancement, we collected 10 paired low-normal videos from the DID dataset (Fu et al. 2023). We utilize six metrics to evaluate the restoration and enhancement quality. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | In practice, we only apply self-corrected trajectory attention after the current diffusion step $t < T_{TA}$, where $T_{TA}$ is set to 100 for the GDP backbone. ... Since $x_0$ is suitable for computing optical flow only in the second half of sampling, we apply pixel-level consistency guidance after the current diffusion step $t < T_{TC}$, which is set to 300 for the GDP backbone. ... Then we apply gradient guidance (Fei et al. 2023) to guide the sampling process. Specifically, we sample $x_{t-1}$ from $\mathcal{N}(\mu + s\,\nabla_{x_0}\mathcal{L}_{TC}(x_0),\ \sigma^2)$, where $s$ is the gradient scale. |
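The guided sampling step quoted above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`guided_sample_step`, `tc_loss_grad`) are hypothetical, and the toy quadratic consistency loss in the usage example stands in for the paper's optical-flow-based loss $\mathcal{L}_{TC}$.

```python
import numpy as np

def guided_sample_step(mu, sigma, x0_pred, tc_loss_grad, s=0.1, rng=None):
    """One gradient-guided reverse-diffusion step (illustrative sketch).

    Shifts the denoising mean `mu` by the scaled gradient of a temporal
    consistency loss with respect to the predicted clean frame x0, then
    samples x_{t-1} ~ N(mu + s * grad, sigma^2).
    """
    rng = rng or np.random.default_rng()
    grad = tc_loss_grad(x0_pred)      # stands in for grad_{x0} L_TC(x0)
    guided_mu = mu + s * grad         # mean shifted toward consistency
    return guided_mu + sigma * rng.standard_normal(np.shape(mu))

# Usage with a toy consistency loss L(x0) = ||x0 - ref||^2,
# whose gradient is 2 * (x0 - ref):
mu = np.zeros(4)
x0_pred = np.ones(4)
ref = np.zeros(4)
x_prev = guided_sample_step(mu, sigma=0.0, x0_pred=x0_pred,
                            tc_loss_grad=lambda x: 2 * (x - ref), s=0.5)
```

With `sigma=0.0` the noise term vanishes, so the step reduces to the deterministic mean shift `mu + s * grad`, which makes the role of the gradient scale `s` easy to inspect.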