A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry

Authors: Yao Lu, Xiaoli Xu, Mingyu Ding, Zhiwu Lu, Tao Xiang

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on the KITTI dataset show that our approach achieves new state-of-the-art in both pose estimation and depth recovery."
Researcher Affiliation | Academia | 1 Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872, China; 2 Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing 100872, China; 3 The University of Hong Kong, Pokfulam, Hong Kong, China; 4 University of Surrey, Guildford, Surrey GU2 7XH, United Kingdom
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described.
Open Datasets | Yes | "For single-view depth estimation, we select the popular KITTI raw dataset (Geiger et al. 2013) with the Eigen (Eigen, Puhrsch, and Fergus 2014) split and the pre-processing method in (Zhou et al. 2017) to remove static frames, as in (Yin and Shi 2018; Zou, Luo, and Huang 2018; Ranjan et al. 2019). This provides 39,810 monocular triplets for training and 4,424 for the test."
Dataset Splits | Yes | "This provides 39,810 monocular triplets for training and 4,424 for the test." (A triplet-assembly sketch follows the table.)
Hardware Specification | Yes | "Our model is trained with a Titan XP GPU for 60 epochs using the Adam optimizer."
Software Dependencies | No | "The total deep learning framework is implemented on PyTorch (Paszke et al. 2019)." No specific version number for PyTorch or other key software components is provided.
Experiment Setup | Yes | "We set w1 = 0.1 and w2 = 0.5 in Eq. (19). Our model is trained with a Titan XP GPU for 60 epochs using the Adam optimizer. We take a learning rate of 10^-4 for the first 35 epochs and reduce it to 10^-5 for the remainder. We set the batch size to 12 and the input/output resolution to 640×192 unless otherwise specified." (A configuration sketch follows the table.)