Continuous Parametric Optical Flow

Authors: Jianqin Luo, Zhexiong Wan, Yuxin Mao, Bo Li, Yuchao Dai

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (5 Experiments) | This section consists of four associated parts. First, we use Kubric [11] to simulate a relatively dense, longer pixel-tracking dataset for model training and as a basic evaluation benchmark. Second, to establish valid baselines, we introduce two approaches [9, 12] that provide direct estimation at sampled moments and a parametric assumption to simulate inter-frame motion. Third, we define error metrics for quantitative comparison and report results on synthetic and real-world scenes. Finally, ablation experiments validate each module. (An illustrative metric sketch follows the table.)
Researcher Affiliation | Academia | Northwestern Polytechnical University, Xi'an, China; Shaanxi Key Laboratory of Information Acquisition and Processing
Pseudocode | No | The paper describes its methods using mathematical equations and textual explanations, but does not provide formal pseudocode or algorithm blocks.
Open Source Code | Yes | Project page: https://npucvr.github.io/CPFlow.
Open Datasets | No | The paper states that the authors create a dataset using Kubric, but it does not provide specific access information (link, DOI, repository) for the generated dataset.
Dataset Splits | No | The paper mentions a 'training stage' and using 'two subsets... for validation' (Query-Stride for training, Query-First for evaluation/dense evaluation), but it does not specify explicit percentages or sample counts for training, validation, and test splits.
Hardware Specification | Yes | We train our model on four RTX 3090 GPUs with a batch size of 8.
Software Dependencies | No | The paper describes the use of certain modules and a 'one-cycle schedule', but it does not specify version numbers for any programming languages, libraries, or frameworks used (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | We train our model on four RTX 3090 GPUs with a batch size of 8. At the training stage, we use a resolution of 256 × 256 and randomly sample 20,480 point trajectories as sparse supervision for better performance. We train for 30,000 steps with a learning rate of 2e-5 and a one-cycle schedule [52], since more iterations may cause degradation across diverse timescales. We repeat runs multiple times and report median results. We set the number of control points N = 6 with degree k = 3, and the number of sampled frames M = 4. The correlation number Q is 3, and the pooling level Z is 3. In the supervision process, N_gt = 8 and γ = 0.8. For temporal augmentation, the length L of video clips is variable but at least 8 frames. (A hedged configuration sketch follows the table.)
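
The error metrics themselves are defined in the paper; purely as an illustration of the kind of quantitative comparison described in the Research Type row, below is a minimal sketch of a generic end-point-error (EPE) computation for predicted point trajectories. The function name, tensor shapes, and the averaging scheme are assumptions for illustration, not the paper's exact metric definition.

```python
import torch

def trajectory_epe(pred: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """Mean end-point error between predicted and ground-truth trajectories.

    pred, gt: (B, T, N, 2) tensors of (x, y) positions for N tracked points
    over T timestamps; shapes and naming are illustrative assumptions.
    """
    # Per-point, per-timestamp L2 distance, averaged over the whole batch.
    return torch.linalg.norm(pred - gt, dim=-1).mean()
```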
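
To make the reported hyperparameters concrete, here is a minimal, hedged sketch of how they could be wired up in PyTorch. Only the numeric settings and the use of a one-cycle schedule [52] come from the paper; the optimizer family, the stand-in model, and the placeholder loss are assumptions, not the authors' implementation.

```python
import torch
from torch.optim.lr_scheduler import OneCycleLR

# Hyperparameters as reported in the paper.
BATCH_SIZE = 8          # total, across four RTX 3090 GPUs
RESOLUTION = (256, 256)
NUM_TRAJ = 20480        # randomly sampled point trajectories as sparse supervision
TOTAL_STEPS = 30_000
MAX_LR = 2e-5
N_CTRL, K_DEG = 6, 3    # B-spline control points N and degree k
M_FRAMES = 4            # number of sampled frames
Q_CORR, Z_POOL = 3, 3   # correlation number and pooling levels
N_GT, GAMMA = 8, 0.8    # supervised timestamps and loss weight

model = torch.nn.Linear(2, 2)  # stand-in for the actual network (assumption)
optimizer = torch.optim.AdamW(model.parameters(), lr=MAX_LR)  # optimizer family is an assumption
scheduler = OneCycleLR(optimizer, max_lr=MAX_LR, total_steps=TOTAL_STEPS)

for step in range(TOTAL_STEPS):
    pred = model(torch.randn(BATCH_SIZE, 2))  # placeholder forward pass
    loss = pred.abs().mean()                  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                          # one-cycle LR update per step
```

The reported N = 6 control points with degree k = 3 suggest a cubic spline-curve trajectory parameterization; a clamped B-spline evaluated with SciPy sketches the idea, though the exact knot convention used by the authors is an assumption:

```python
import numpy as np
from scipy.interpolate import BSpline

N, k = 6, 3  # control-point count and degree, as reported
# Clamped knot vector on [0, 1]; its length must equal N + k + 1 = 10.
knots = np.concatenate([np.zeros(k), np.linspace(0.0, 1.0, N - k + 1), np.ones(k)])
ctrl = np.random.rand(N, 2)  # hypothetical 2D control points for one pixel's path
curve = BSpline(knots, ctrl, k)
positions = curve(np.linspace(0.0, 1.0, 50))  # (50, 2) positions, queryable at any t
```

With a clamped knot vector the curve interpolates its first and last control points, so a single set of six control points yields a continuous trajectory that can be queried at any timestamp in [0, 1].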