DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization

Authors: Yueming Xu, Haochen Jiang, Zhongyang Xiao, Jianfeng Feng, Li Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that DG-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, and novel-view synthesis in dynamic scenes, outperforming existing methods while preserving real-time rendering ability.
Researcher Affiliation | Collaboration | Yueming Xu (Fudan University), Haochen Jiang (Fudan University), Zhongyang Xiao (Autonomous Driving Division, NIO), Jianfeng Feng (Fudan University), Li Zhang (Fudan University)
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | https://github.com/fudan-zvg/DG-SLAM
Open Datasets | Yes | Our methodology is evaluated using three publicly available challenging datasets: the TUM RGB-D dataset [38], the BONN RGB-D Dynamic dataset [5], and ScanNet [39].
Dataset Splits | No | The paper discusses training and optimization on datasets but does not explicitly provide details about separate training, validation, and test splits with percentages or sample counts.
Hardware Specification | Yes | We run our DG-SLAM on an RTX 3090 Ti GPU at 2 FPS on the BONN dataset, which takes roughly 9 GB of memory.
Software Dependencies | No | We utilize Oneformer [41] to generate prior semantic segmentation. For the depth warp mask, we set the window size to 4 and the depth threshold to 0.6.
Experiment Setup | Yes | We set the loss weights λ1 = 0.9, λ2 = 0.2, and λ3 = 0.1 to train our model. The number of iterations for the tracking and mapping processes is set to 20 and 40, respectively. For Gaussian point deletion, we set τα = 0.005, τS1 = 0.4, and τS2 = 36 to avoid the generation of abnormal Gaussian points. What's more, we utilize Oneformer [41] to generate prior semantic segmentation. For the depth warp mask, we set the window size to 4 and the depth threshold to 0.6. We also adopt the keyframe selection strategy from DROID-VO [19] based on optical flow.
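
To make the depth warp mask parameters quoted above (a window of 4 frames and a depth threshold of 0.6) concrete, the sketch below gives one plausible reading of a warp-based motion mask: depths from neighboring keyframes are reprojected into the current view, and pixels whose observed depth disagrees with the warped depth by more than the threshold are flagged as dynamic. This is a minimal illustration with hypothetical names (depth_warp_mask, poses_ref_to_cur, and so on), not the authors' implementation.

    import numpy as np

    def depth_warp_mask(depth_cur, depths_ref, poses_ref_to_cur, K, depth_thresh=0.6):
        """Hypothetical sketch of a depth-warp motion mask.

        depth_cur:        (H, W) depth map of the current frame
        depths_ref:       list of (H, W) depth maps from a window of reference frames (e.g. 4)
        poses_ref_to_cur: list of 4x4 transforms mapping each reference camera into the current one
        K:                3x3 camera intrinsics
        Returns a boolean (H, W) mask that is True where depth disagreement suggests motion.
        """
        H, W = depth_cur.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        mask = np.zeros((H, W), dtype=bool)
        K_inv = np.linalg.inv(K)

        for depth_ref, T in zip(depths_ref, poses_ref_to_cur):
            # Back-project reference pixels to 3D, then transform into the current camera.
            pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T   # 3 x N
            pts_ref = (K_inv @ pix) * depth_ref.reshape(1, -1)                   # 3 x N
            pts_cur = T[:3, :3] @ pts_ref + T[:3, 3:4]                           # 3 x N

            # Project into the current image and compare warped depth with observed depth.
            proj = K @ pts_cur
            z = proj[2]
            x = np.round(proj[0] / np.clip(z, 1e-6, None)).astype(int)
            y = np.round(proj[1] / np.clip(z, 1e-6, None)).astype(int)
            valid = (x >= 0) & (x < W) & (y >= 0) & (y < H) & (z > 0)

            observed = depth_cur[y[valid], x[valid]]
            mask[y[valid], x[valid]] |= np.abs(observed - z[valid]) > depth_thresh
        return mask

Depth disagreement alone conflates occlusion with motion, which is presumably why the quoted setup also pairs the warp mask with Oneformer semantic priors.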
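
The Experiment Setup row lists hyperparameters scattered across several sentences; collecting them in one place makes the configuration easier to reproduce. The dictionary below only restates the values quoted above (the key names are hypothetical), and the small helper shows one possible interpretation of τα and τS1 as opacity and scale thresholds for Gaussian point deletion; the quoted text does not spell out the role of τS2, so it is recorded but not used.

    import numpy as np

    # Hyperparameters quoted in the Experiment Setup row; key names are hypothetical,
    # only the numeric values come from the paper.
    DG_SLAM_CONFIG = {
        "loss_weights": {"lambda_1": 0.9, "lambda_2": 0.2, "lambda_3": 0.1},
        "iterations": {"tracking": 20, "mapping": 40},
        "gaussian_deletion": {"tau_alpha": 0.005, "tau_S1": 0.4, "tau_S2": 36},
        "depth_warp_mask": {"window_size": 4, "depth_threshold": 0.6},
        "semantic_prior": "Oneformer",                    # [41]
        "keyframe_selection": "optical flow (DROID-VO)",  # [19]
    }

    def gaussian_keep_mask(opacity, max_scale, cfg=DG_SLAM_CONFIG["gaussian_deletion"]):
        """Hedged sketch of the deletion rule: keep Gaussians whose opacity exceeds
        tau_alpha and whose largest scale stays below tau_S1. tau_S2 is omitted here
        because its exact meaning is not stated in the quoted setup."""
        opacity = np.asarray(opacity)
        max_scale = np.asarray(max_scale)
        return (opacity > cfg["tau_alpha"]) & (max_scale < cfg["tau_S1"])

For example, gaussian_keep_mask([0.001, 0.5], [0.1, 0.2]) returns [False, True]: the first Gaussian is dropped for low opacity, the second is kept.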