NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation

Authors: Yuanze Wang, Yichao Yan, Dianxi Shi, Wenhan Zhu, Jianqiang Xia, Tan Jeff, Songchang Jin, Ke Gao, Xiaobo Li, Xiaokang Yang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 (Experiments): In this section, we first compare the proposed NeRF-IBVS with state-of-the-art visual localization methods. Furthermore, our method is extended to IBVS-based navigation without using custom markers and the depth sensor, and its effectiveness is verified in simulation experiments. Section 5.3 (Ablation Studies): In this section, we conduct ablation experiments on the 12-Scenes dataset to evaluate the effectiveness of the major design of the proposed NeRF-IBVS.
Researcher Affiliation | Collaboration | (1) MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; (2) Intelligent Game and Decision Lab (IGDL), Beijing, China; (3) Tianjin Artificial Intelligence Innovation Center; (4) Alibaba Group
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that its source code is publicly available.
Open Datasets | Yes | Datasets. We conduct experiments on two public indoor benchmark datasets. 7-Scenes [35] contains seven indoor scenes recorded by a Kinect V1 camera; the data include RGB-D images, poses, and ground-truth 3D models. 12-Scenes [39] contains twelve indoor scenes with RGB-D images and poses.
Dataset Splits | No | The paper gives the number of training samples for each dataset in Table 1 and Table 2 but does not explicitly specify the training/validation/test splits, such as percentages or a partitioning method.
Hardware Specification | Yes | For training time, the Nerfacto and coordinate regression network are trained on one NVIDIA RTX 3090 GPU for about 2 days.
Software Dependencies | No | The paper mentions using "Nerfacto" and "SuperPoint [14] with SuperGlue [29]", but it does not provide version numbers for these software components or for any other libraries.
Experiment Setup | Yes | Nerfacto was trained for 100,000 iterations, and all camera poses are only centralized, with no scale adjustment. The coordinate regression network is trained for 40 epochs with a learning rate of 0.001. We empirically set τ = 200 for the threshold of coordinate distance. In pose optimization, we use SuperPoint with SuperGlue and their official weights to detect correspondences, and set λ = 0.5 and N1 = 100 for the IBVS iterations.
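
To make the Experiment Setup values easier to reuse, the sketch below collects them in a small Python config and shows how a gain and an iteration count typically enter an image-based visual servoing (IBVS) loop. This is not the paper's code, which is not publicly available. All names (`ReportedSetup`, `interaction_matrix`, `ibvs_velocity`, `simulate`) are hypothetical, the control law is the textbook point-feature IBVS law, and the sketch assumes that λ is the control gain and N1 the number of servo iterations, which the row above does not state explicitly. The simulation is a toy: feature positions are advanced by a first-order prediction with depths held fixed, whereas a real system would re-observe the features after each camera motion.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    nerfacto_iterations: int = 100_000
    regression_epochs: int = 40
    regression_lr: float = 1e-3
    coord_distance_threshold: float = 200.0  # tau (unit not given in the row)
    ibvs_gain: float = 0.5                   # lambda, assumed to be the gain
    ibvs_iterations: int = 100               # N1, assumed to be the loop count


def interaction_matrix(points, depths):
    """Stack the textbook 2x6 interaction matrices of N image points.

    points: (N, 2) normalized image coordinates (x, y)
    depths: (N,) depth Z of each point in the camera frame
    """
    rows = []
    for (x, y), z in zip(points, depths):
        rows.append([-1.0 / z, 0.0, x / z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / z, y / z, 1.0 + y * y, -x * y, -x])
    return np.asarray(rows)


def ibvs_velocity(current, desired, depths, gain):
    """Classic IBVS law: v = -gain * pinv(L) @ (s - s_desired)."""
    error = (np.asarray(current) - np.asarray(desired)).reshape(-1)
    L = interaction_matrix(current, depths)
    return -gain * np.linalg.pinv(L) @ error


def simulate(current, desired, depths, cfg, dt=1.0):
    """Toy loop: predict feature motion as s += dt * L @ v, depths held fixed."""
    s = np.array(current, dtype=float)
    for _ in range(cfg.ibvs_iterations):
        v = ibvs_velocity(s, desired, depths, cfg.ibvs_gain)
        s = s + dt * (interaction_matrix(s, depths) @ v).reshape(-1, 2)
    return s


cfg = ReportedSetup()
desired = np.array([[-0.1, -0.1], [0.1, -0.1], [0.1, 0.1], [-0.1, 0.1]])
current = desired + 0.02          # every feature shifted by the same offset
depths = np.full(4, 2.0)          # made-up constant depth for the toy scene
final = simulate(current, desired, depths, cfg)
print(np.linalg.norm(final - desired))
```

In this toy setting the feature error shrinks at every step, so the printed residual is close to zero after the N1 = 100 iterations. The depths passed to `interaction_matrix` are invented here; where a real pipeline obtains them is outside what the table reports.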