Recurrent Partial Kernel Network for Efficient Optical Flow Estimation

Authors: Henrique Morimitsu, Xiaobin Zhu, Xiangyang Ji, Xu-Cheng Yin

AAAI 2024

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | Experiments on public benchmarks show that we achieve state-of-the-art generalization performance while requiring significantly fewer parameters and memory than competing methods. Our model ranks first in the Spring benchmark without finetuning, improving the results by over 10% while requiring an order of magnitude fewer FLOPs and over four times less memory than the following published method without finetuning.

Researcher Affiliation | Academia | 1 School of Computer and Communication Engineering, University of Science and Technology Beijing, China; 2 Department of Automation, Tsinghua University, China

Pseudocode | No | The paper includes figures illustrating the model architecture and mathematical equations, but no pseudocode or algorithm blocks.

Open Source Code | Yes | The code is available at github.com/hmorimitsu/ptlflow/tree/main/ptlflow/models/rpknet.

Open Datasets | Yes | We use the extended training (Sun et al. 2022a; Xu et al. 2022) with 100k iterations on the FlyingChairs dataset (Dosovitskiy et al. 2015) followed by 1M iterations on FlyingThings3D (Mayer et al. 2016). For the Sintel (Butler et al. 2012) and Spring (Mehl et al. 2023) benchmarks, we use 250k iterations to finetune the model on a mixed dataset combining FlyingThings3D, KITTI (Geiger, Lenz, and Urtasun 2012; Menze and Geiger 2015), HD1K (Kondermann et al. 2016), and Sintel samples.

Dataset Splits | Yes | We use the extended training (...). For the Sintel (...) and Spring (...) benchmarks, we use 250k iterations to finetune the model on a mixed dataset combining FlyingThings3D, KITTI (...), HD1K (...), and Sintel samples. For KITTI, we start from the Sintel model and further finetune it on the KITTI 2015 dataset for 5k iterations. We use 12 refinement iterations for training and ablation experiments and 32 for the public benchmark evaluation. Table 1: Results with the official metrics on the KITTI 2015 (Fl-All), MPI-Sintel (EPE), and Spring (1px) datasets. Train results are collected from models without finetuning, while test results come from the official benchmarks.

Hardware Specification | Yes | Each model is trained once using 1234 as the random seed on two NVIDIA RTX 3090 GPUs.

Software Dependencies | No | The paper mentions the 'PyTorch profiler' but does not specify any software dependencies with the version numbers required for reproduction.

Experiment Setup | Yes | We follow the same training routine as RAFT (Teed and Deng 2020), using the AdamW optimizer (Loshchilov and Hutter 2019) combined with the one-cycle learning rate schedule (Smith and Topin 2019). We use the extended training (...) with 100k iterations (...) followed by 1M iterations (...). For the Sintel (...) and Spring (...) benchmarks, we use 250k iterations to finetune the model on a mixed dataset (...). For KITTI, we start from the Sintel model and further finetune it (...) for 5k iterations. We use 12 refinement iterations for training and ablation experiments and 32 for the public benchmark evaluation. Each model is trained once using 1234 as the random seed.
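The staged schedule quoted in the Dataset Splits and Experiment Setup rows (100k iterations on FlyingChairs, 1M on FlyingThings3D, 250k on the mixed set, 5k on KITTI 2015) can be summarized as a small config. This is an illustrative sketch only; the structure and labels are hypothetical and not taken from the RPKNet code:

```python
# Hypothetical summary of the staged training schedule described above;
# stage names and data structure are illustrative, not the authors' code.
SCHEDULE = [
    ("FlyingChairs", 100_000),
    ("FlyingThings3D", 1_000_000),
    ("FlyingThings3D + KITTI + HD1K + Sintel mix", 250_000),
    ("KITTI 2015", 5_000),
]

# Total optimizer steps across all stages.
total_iterations = sum(steps for _, steps in SCHEDULE)
print(total_iterations)  # 1355000
```

This puts the full pipeline at 1,355,000 iterations, the bulk of which is the FlyingThings3D pretraining stage.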
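The one-cycle schedule (Smith and Topin 2019) named in the Experiment Setup row warms the learning rate from a low value up to a peak, then anneals it back down over a single run. A minimal pure-Python sketch of the cosine variant follows; all hyperparameter values (`max_lr`, `pct_start`, the divisor factors) are illustrative defaults, not the paper's settings:

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.05,
                 div_factor=25.0, final_div_factor=1e4):
    """Cosine one-cycle learning rate: warm up to max_lr, then anneal down.

    Illustrative re-implementation of the schedule's shape; hyperparameters
    are assumptions, not values from the RPKNet paper.
    """
    initial_lr = max_lr / div_factor          # LR at step 0
    min_lr = initial_lr / final_div_factor    # LR at the final step
    warmup_steps = int(pct_start * total_steps)
    if step < warmup_steps:
        # Cosine ramp from initial_lr up to max_lr.
        t = step / max(1, warmup_steps)
        return initial_lr + (max_lr - initial_lr) * 0.5 * (1 - math.cos(math.pi * t))
    # Cosine anneal from max_lr down to min_lr.
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + (max_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * t))
```

PyTorch ships the same shape as `torch.optim.lr_scheduler.OneCycleLR`, which is the natural choice when reproducing a RAFT-style training routine with AdamW.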