Promoting Single-Modal Optical Flow Network for Diverse Cross-Modal Flow Estimation

Authors: Shili Zhou, Weimin Tan, Bo Yan

AAAI 2022, pp. 3562-3570 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments demonstrate that our method is effective on multiple datasets of different cross-modal scenarios.
Researcher Affiliation | Academia | School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China
Pseudocode | No | The paper describes the proposed framework and models using text and diagrams, but does not include any explicit pseudocode blocks or algorithms.
Open Source Code | No | The paper mentions utilizing 'open-source PWC-Net (Sun et al. 2018) and RAFT (Teed and Deng 2020) code and weights', but does not state that the authors' own proposed framework or models (MPF, CMA, Cross RAFT) are open-source, nor does it provide any link to their code.
Open Datasets | Yes | We use the YouTube-VOS dataset (Xu et al. 2018) as the training set. ... Three datasets are used to evaluate models for cross-modal flow estimation. They are RGBNIR-Stereo (Zhi et al. 2018), Tri Modal Human (Palmero et al. 2016) and a dataset synthesized by ourselves named Cross KITTI, which will be described in detail when introducing the experiment results. ... The KITTI 2012 dataset (Geiger, Lenz, and Urtasun 2012) is collected with a set of in-vehicle sensors, and has sparse optical flow annotations for some real-world frames. We randomly select 59 image pairs with flow annotations in KITTI 2012 to construct our Cross KITTI. (A hypothetical reconstruction of this selection step appears in the first sketch after the table.)
Dataset Splits | No | The paper mentions using the 'YouTube-VOS dataset as the training set' and evaluating on 'RGBNIR-Stereo, Tri Modal Human, and Cross KITTI', but does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits).
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX 2080 Ti GPU with an Intel Core i7-9700K@3.60GHz CPU (32 GB RAM).
Software Dependencies | No | The paper states: 'We implement our framework and models in PyTorch.' and 'The data augmentation is implemented with Albumentations (Buslaev et al. 2020).' It also mentions using the 'AdamW (Loshchilov and Hutter 2017) optimizer'. However, it does not provide specific version numbers for PyTorch or Albumentations, which are required for reproducibility. (A version-logging snippet follows the table.)
Experiment Setup | Yes | We use the AdamW (Loshchilov and Hutter 2017) optimizer with learning-rate=0.00002 and weight-decay=0.00005 to train the models for 10k steps with batch-size=4. (A minimal training-loop sketch with these hyperparameters closes this report.)
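Since the paper's Cross KITTI construction is described only in prose, here is a hypothetical sketch of that selection step under stated assumptions: the standard public KITTI 2012 directory layout, an assumed root path, and an arbitrary seed (the paper reports neither a seed nor the exact sampling procedure).

```python
import random
from pathlib import Path

# Hypothetical reconstruction of the Cross KITTI selection step: randomly
# sample 59 annotated image pairs from KITTI 2012. The root path and seed
# are assumptions; the folder names follow the public KITTI 2012 release.
KITTI_ROOT = Path("data/kitti2012/training")  # assumed location
random.seed(0)                                # no seed is reported

# KITTI 2012 stores sparse flow ground truth as flow_occ/<frame>_10.png
annotated = sorted((KITTI_ROOT / "flow_occ").glob("*_10.png"))
selected = random.sample(annotated, 59)

for flow_gt in selected:
    frame_id = flow_gt.stem.split("_")[0]  # e.g. "000042"
    img1 = KITTI_ROOT / "colored_0" / f"{frame_id}_10.png"
    img2 = KITTI_ROOT / "colored_0" / f"{frame_id}_11.png"
    print(img1, img2, flow_gt)
```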
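The missing-versions point above is easy to address on the reader's side when rerunning the code. A minimal snippet, assuming only the two libraries the paper names, records the environment alongside any reproduction attempt:

```python
import torch
import albumentations

# Log the versions the paper omits, so a reproduction run is self-describing.
print("PyTorch:", torch.__version__)
print("Albumentations:", albumentations.__version__)
print("CUDA runtime:", torch.version.cuda)
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```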
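Finally, the reported setup translates directly into a PyTorch optimizer configuration. The sketch below uses only the stated hyperparameters (AdamW, learning rate 0.00002, weight decay 0.00005, batch size 4, 10k steps); the model, data, and loss are toy placeholders, since the paper's network and training data are not released.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholders: the real model is the paper's cross-modal flow network and
# the real data comes from YouTube-VOS; only the hyperparameters below are
# taken from the paper.
model = nn.Conv2d(6, 2, kernel_size=3, padding=1)  # toy "flow" head
frames = torch.randn(16, 6, 64, 64)                # concatenated image pairs
flows = torch.randn(16, 2, 64, 64)                 # dense flow targets
loader = DataLoader(TensorDataset(frames, flows), batch_size=4, shuffle=True)

# AdamW with the reported learning rate and weight decay.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=5e-5)

step, max_steps = 0, 10_000
while step < max_steps:
    for pair, target in loader:
        optimizer.zero_grad()
        loss = nn.functional.l1_loss(model(pair), target)  # loss choice assumed
        loss.backward()
        optimizer.step()
        step += 1
        if step >= max_steps:
            break
```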