Video Frame Prediction from a Single Image and Events
Authors: Juanjuan Zhu, Zhexiong Wan, Yuchao Dai
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposed model significantly outperforms the state-of-the-art frame-based and event-based VFP methods and has the fastest runtime. |
| Researcher Affiliation | Academia | Juanjuan Zhu*, Zhexiong Wan*, Yuchao Dai; School of Electronics and Information, Northwestern Polytechnical University. {juanjuanzhu2022, wanzhexiong}@mail.nwpu.edu.cn, daiyuchao@nwpu.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://npucvr.github.io/VFPSIE/. |
| Open Datasets | Yes | Consistent with the setting of existing event-based VFI (Tulyakov et al. 2021), we pre-simulate the events of the Vimeo90k septuplet dataset (Xue et al. 2019) and GoPro dataset (Nah, Hyun Kim, and Mu Lee 2017), then train our model on Vimeo90k and evaluate on the GoPro test set. For experiments with real-captured data, we choose the HS-ERGB dataset (Tulyakov et al. 2021) for evaluation, which records data at 1280×720 resolution at 160 fps and contains diverse scenes. Besides, we also perform quantitative comparisons on DSEC (Gehrig et al. 2021), a dataset of events for driving scenarios. |
| Dataset Splits | No | The paper does not provide explicit training/validation/test split percentages or sample counts for a validation set. It only mentions training on Vimeo90k and testing on GoPro, and, for ablations, training on the GoPro training set and evaluating on its test set. |
| Hardware Specification | Yes | In this paper, we propose a lightweight network that can predict a 720P frame within 25ms on an RTX2080Ti GPU. All experiments are conducted with PyTorch. We employ an AdamW optimizer for 50 epochs of training with batch size 4 on two NVIDIA RTX3090 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with versions. |
| Experiment Setup | Yes | Implementation Details: All experiments are conducted with PyTorch. We employ an AdamW optimizer for 50 epochs of training with batch size 4 on two NVIDIA RTX3090 GPUs. The learning rate is decayed from 1×10⁻⁴ to 1×10⁻⁵ with a cosine learning rate scheduler. To obtain reliable motion priors, we first pretrain our model only under the supervision of a task-oriented flow loss in the first 15 epochs, then train the model with the full loss for the remaining 35 epochs. To augment the training data, we apply vertical and horizontal flips with 50% probability and randomly crop 384×384 patches. |
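The training schedule described in the Experiment Setup row can be sketched in plain Python. This is a minimal illustration, not the authors' code: the helpers `lr_at` and `loss_mode` are hypothetical names, and the cosine decay formula assumes the standard cosine-annealing shape (as in PyTorch's `CosineAnnealingLR`) between the stated endpoints 1×10⁻⁴ and 1×10⁻⁵ over 50 epochs, with the flow-only pretraining covering the first 15 epochs.

```python
import math

# Constants taken from the paper's stated setup.
LR_MAX, LR_MIN = 1e-4, 1e-5   # learning rate endpoints
EPOCHS = 50                   # total training epochs
PRETRAIN_EPOCHS = 15          # flow-loss-only pretraining phase

def lr_at(epoch: int) -> float:
    """Cosine-annealed learning rate for a 0-indexed epoch (assumed schedule)."""
    t = epoch / (EPOCHS - 1)  # training progress in [0, 1]
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1 + math.cos(math.pi * t))

def loss_mode(epoch: int) -> str:
    """Which supervision is active: flow loss alone first, then the full loss."""
    return "flow_only" if epoch < PRETRAIN_EPOCHS else "full"

for epoch in (0, 14, 15, 49):
    print(f"epoch {epoch:2d}: lr={lr_at(epoch):.2e}, loss={loss_mode(epoch)}")
```

The schedule starts at 1e-4, ends at 1e-5, and switches from the flow-only loss to the full loss at epoch 15; in an actual run the per-epoch rate would instead come from `torch.optim.lr_scheduler.CosineAnnealingLR` attached to an `AdamW` optimizer.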