VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization

Authors: Kaichen Zhou, Changhao Chen, Bing Wang, Muhamad Risqi U. Saputra, Niki Trigoni, Andrew Markham (pp. 6165-6173)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our model is extensively evaluated on RGB-D datasets and the results prove the efficacy of our model. Extensive experiments on indoor and outdoor scenarios and systematic research into the robustness and ablation demonstrate the effectiveness of our proposed framework. Our proposed VMLoc framework is evaluated on two common public datasets: 7-Scenes (Shotton et al. 2013) and Oxford RobotCar (Maddern et al. 2017).
Researcher Affiliation | Collaboration | Kaichen Zhou (1), Changhao Chen* (2), Bing Wang (1), Muhamad Risqi U. Saputra (1), Niki Trigoni (1), Andrew Markham (1); (1) Department of Computer Science, University of Oxford; (2) College of Intelligence Science, National University of Defense Technology
Pseudocode | Yes | Algorithm 1 demonstrates the detailed algorithmic description of our proposed VMLoc. ... Algorithm 1: VMLoc algorithm. (An illustrative sketch of the fusion step appears after the table.)
Open Source Code | Yes | The source code is available at https://github.com/Zalex97/VMLoc.
Open Datasets | Yes | Our proposed VMLoc framework is evaluated on two common public datasets: 7-Scenes (Shotton et al. 2013) and Oxford RobotCar (Maddern et al. 2017).
Dataset Splits | Yes | We split the data into training and testing sets according to the official instructions. The Oxford RobotCar dataset contains multimodal data from car-mounted sensors, e.g., cameras, lidar, and GPS/IMU. We use the same data splits of this dataset, named LOOP and FULL, as in (Brahmbhatt et al. 2018) and (Wang et al. 2019). (A sketch of reading the official 7-Scenes splits appears after the table.)
Hardware Specification | Yes | Our approach is implemented using PyTorch. The model is trained and tested with an NVIDIA Titan V GPU.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with versions (e.g., Python version, CUDA version).
Experiment Setup | Yes | During training, both RGB images and depth maps are taken as input; they are rescaled so that the shortest side is 256 pixels long and normalized into the range [-1, 1]. In the case of VMLoc, the sampling number k is set to 10. The batch size is set to 64, and the Adam optimizer is used with a learning rate of 5 × 10⁻⁵ and a weight decay rate of 5 × 10⁻⁵. The training dropout rate is set to 0.5 and the initial balance weights are β₀ = 3.0 and γ₀ = 0.0. (A training-setup sketch appears below.)
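
The algorithm listing itself is not reproduced in this summary. As a rough illustration of the variational product-of-experts fusion the paper describes (combining RGB and depth latents, then drawing k reparameterized samples), a minimal PyTorch sketch might look like the following; the function and tensor names are assumptions for illustration, not the authors' code:

```python
import torch

def poe_fuse_and_sample(mu_rgb, logvar_rgb, mu_d, logvar_d, k=10):
    """Product-of-experts fusion of two Gaussian latents, then k samples.

    Illustrative only: the exact fusion and importance-weighted objective
    are specified in Algorithm 1 of the paper and the released code.
    """
    # Precision-weighted combination of the RGB and depth experts.
    precision = torch.exp(-logvar_rgb) + torch.exp(-logvar_d)
    var = 1.0 / precision
    mu = var * (mu_rgb * torch.exp(-logvar_rgb) + mu_d * torch.exp(-logvar_d))

    # Reparameterization trick, drawing k samples per input (k = 10 in the paper).
    eps = torch.randn((k,) + mu.shape)
    return mu + var.sqrt() * eps  # shape: (k, batch, latent_dim)
```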
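
For 7-Scenes, the official release ships per-scene TrainSplit.txt / TestSplit.txt files listing which sequences belong to each split. A hedged sketch of honoring that split, assuming the standard dataset layout (the local path and entry format here are assumptions):

```python
from pathlib import Path

def read_split(scene_dir, split):
    """Return sequence folder names (e.g. ['seq-01', 'seq-04']) for a scene."""
    lines = (Path(scene_dir) / f"{split}Split.txt").read_text().splitlines()
    # Entries such as 'sequence1' map to on-disk folders named 'seq-01'.
    return [f"seq-{int(l.strip().replace('sequence', '')):02d}"
            for l in lines if l.strip()]

train_seqs = read_split("7scenes/chess", "Train")  # hypothetical local path
test_seqs = read_split("7scenes/chess", "Test")
```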
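
The reported hyperparameters map directly onto standard PyTorch calls. A minimal sketch of the training configuration, assuming torchvision transforms and a placeholder module in place of the real VMLoc network (which lives in the authors' repository):

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Rescale the shortest side to 256 px, then normalize pixels into [-1, 1]
# (shown for the 3-channel RGB stream; the depth stream would use 1 channel).
preprocess = transforms.Compose([
    transforms.Resize(256),                     # shortest side -> 256
    transforms.ToTensor(),                      # [0, 255] -> [0.0, 1.0]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],
                         std=[0.5, 0.5, 0.5]),  # [0, 1] -> [-1, 1]
])

# Placeholder for the VMLoc network; only the reported dropout rate of 0.5
# is reflected here. Output dim 7 = 3-D translation + 4-D quaternion.
model = nn.Sequential(nn.Flatten(), nn.Dropout(p=0.5), nn.LazyLinear(7))

# Learnable balance weights with the reported initial values.
beta = nn.Parameter(torch.tensor(3.0))   # beta_0 = 3.0
gamma = nn.Parameter(torch.tensor(0.0))  # gamma_0 = 0.0

# Adam with the reported learning rate and weight decay; the batch size of 64
# would be set on the DataLoader.
optimizer = torch.optim.Adam(
    list(model.parameters()) + [beta, gamma],
    lr=5e-5, weight_decay=5e-5,
)
```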