Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Segment then Splat: Unified 3D Open-Vocabulary Segmentation via Gaussian Splatting
Authors: Yiren Lu, Yunlai Zhou, Yiran Qiao, Chaoda Song, Tuo Liang, Jing Ma, Huan Wang, Yu Yin
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various datasets demonstrate the effectiveness of our proposed method in both static and dynamic scenarios. Extensive experiments demonstrate state-of-the-art performance across diverse static and dynamic datasets in 3D open-vocabulary segmentation, object geometry accuracy, and computational efficiency. |
| Researcher Affiliation | Academia | ¹Case Western Reserve University, ²Westlake University |
| Pseudocode | No | The paper describes the methodology in detail across sections like 'Robust Object Tracking', 'Object-Specific Gaussian Initialization', 'Optimization & Reconstruction', and 'CLIP Embedding Association', illustrating the pipeline with Figure 2, but it does not present a formal pseudocode or algorithm block. |
| Open Source Code | Yes | The code is released at https://github.com/luyr/Segment-then-Splat. |
| Open Datasets | Yes | To assess the segmentation performance of our proposed method, we conduct experiments on two static scene datasets (i.e., 3DOVS dataset [42] and LERF_OVS dataset [9]) and two dynamic scene datasets (i.e., HyperNeRF dataset [43] and Neu3D dataset [44]). |
| Dataset Splits | No | To assess the segmentation performance of our proposed method, we conduct experiments on two static scene datasets (i.e., 3DOVS dataset [42] and LERF_OVS dataset [9]) and two dynamic scene datasets (i.e., HyperNeRF dataset [43] and Neu3D dataset [44]). We use mean intersection over union (mIoU) for open-vocabulary segmentation and report optimization time (in minutes) for training efficiency. While the paper uses established datasets for evaluation, it does not explicitly provide details on how these datasets were split into training, validation, and test sets, or reference standard splits for reproducibility within the main text. |
| Hardware Specification | Yes | All experiments are conducted using a RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions leveraging Segment Anything (SAM) [39] and SAM2 [40] for object tracking, and CLIP [21] embeddings, but it does not specify version numbers for these or other software libraries (e.g., Python, PyTorch versions) used in the implementation. |
| Experiment Setup | Yes | The new object detection stride t in the robust object tracking is set to 10. Following the original 3D Gaussian Splatting, we set λ_r in L_render to 0.2. For geometric-appearances distance, we set λ_d to 0.5. In each iteration, we sample 1 object per granularity for 3DOVS to compute L_obj and 3 objects per granularity for all the remaining datasets. The mIoU threshold for partial mask filtering is set to 30%. We train the smaller-scale 3DOVS data for 20K iterations, and larger-scale datasets (i.e., LERF_OVS, HyperNeRF, and Neu3D) for 40K iterations. |
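
For anyone attempting a reproduction, the values quoted in the Experiment Setup row could be collected in one place, roughly as in the sketch below. Only the numbers come from the row above. The class, field, and function names are hypothetical and are not taken from the released code, which may organize its configuration quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""

    detection_stride: int = 10                 # new object detection stride t
    lambda_render: float = 0.2                 # lambda_r in L_render
    lambda_distance: float = 0.5               # lambda_d for the geometric-appearance distance
    partial_mask_miou_threshold: float = 0.30  # mIoU threshold for partial mask filtering


SETUP = ExperimentSetup()

# Training iterations per dataset, as reported: 20K for 3DOVS, 40K for the rest.
TRAIN_ITERATIONS = {
    "3DOVS": 20_000,
    "LERF_OVS": 40_000,
    "HyperNeRF": 40_000,
    "Neu3D": 40_000,
}


def train_iterations(dataset: str) -> int:
    """Return the reported iteration count; raises KeyError for unlisted datasets."""
    return TRAIN_ITERATIONS[dataset]


def objects_per_granularity(dataset: str) -> int:
    """Objects sampled per granularity each iteration to compute L_obj."""
    if dataset not in TRAIN_ITERATIONS:
        raise KeyError(dataset)
    return 1 if dataset == "3DOVS" else 3


if __name__ == "__main__":
    for name in TRAIN_ITERATIONS:
        print(name, train_iterations(name), objects_per_granularity(name))
```

The sketch stores the 30% mask-filtering threshold but does not implement the filter itself, since the row does not say how the threshold is applied.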