SA3DIP: Segment Any 3D Instance with Potential 3D Priors
Authors: Xi Yang, Xu Gu, Xingyilang Yin, Xinbo Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations on various 2D-3D datasets demonstrate the effectiveness and robustness of our approach. |
| Researcher Affiliation | Academia | Xi Yang¹, Xu Gu¹, Xingyilang Yin¹, Xinbo Gao²; ¹Xidian University, ²Chongqing University of Posts and Telecommunications |
| Pseudocode | Yes | Algorithm 1 Instance-aware refinement |
| Open Source Code | Yes | Our code and proposed ScanNetV2-INS dataset are available HERE. |
| Open Datasets | Yes | ScanNet [11] integrates a comprehensive array of 2D and 3D data sourced from indoor environments, facilitated by an iPad application in tandem with depth sensors. This dataset includes RGB and depth images, along with 3D point cloud data, all meticulously annotated with semantic and instance labels. It encompasses an extensive collection of over 2.5 million views derived from more than 1500 scans. In contrast, ScanNet++ [12] represents a recently introduced indoor dataset exhibiting a similar composition to ScanNet but boasting higher-resolution 3D geometry and more detailed data annotations. Matterport3D [44] and Replica [45] |
| Dataset Splits | Yes | With the aid of a recently released annotation tool AGILE3D proposed in [43], we perform point-level updates on the ground truth annotations for all 312 scenes in the validation set efficiently. |
| Hardware Specification | Yes | We conduct all experiments on a single RTX4090. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4). |
| Experiment Setup | Yes | The weights for geometry and texture used in the complementary primitives generation are set as w_n = 0.96 and w_c = 0.04. This is because texture prior such as RGB values are not robust enough when being used solely due to lighting conditions, reflections, shadows, and noise collected by sensors. We conduct detailed ablation study on the choice of the two weights in the later section. The threshold δ_1 in the region growing is empirically set as [0.9, 0.8, 0.7, 0.6, 0.5] for ScanNetV2 and ScanNetV2-INS, [0.9, 0.8, 0.7] for ScanNet++, and the threshold δ_2 in instance-aware refinement is set as 0.75 experimentally. |
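
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, sketched below in Python. Only the numeric values come from the quoted text. The names `SetupConfig`, `DELTA1_BY_DATASET`, and `combined_score` are hypothetical, and the linear weighting of a geometry score and a texture score is an assumption made for illustration; the excerpt does not state how the two weights are actually combined.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SetupConfig:
    """Hyperparameter values quoted in the Experiment Setup row."""
    w_n: float = 0.96     # weight for geometry
    w_c: float = 0.04     # weight for texture
    delta2: float = 0.75  # threshold used in instance-aware refinement


# Per-dataset threshold schedules for region growing (delta_1), as quoted.
DELTA1_BY_DATASET: Dict[str, Tuple[float, ...]] = {
    "ScanNetV2": (0.9, 0.8, 0.7, 0.6, 0.5),
    "ScanNetV2-INS": (0.9, 0.8, 0.7, 0.6, 0.5),
    "ScanNet++": (0.9, 0.8, 0.7),
}


def combined_score(geometry_score: float, texture_score: float,
                   cfg: SetupConfig = SetupConfig()) -> float:
    """Blend a geometry score and a texture score with the quoted weights.

    ASSUMPTION: a plain weighted sum. The source only gives the weights,
    not the formula in which they are used.
    """
    return cfg.w_n * geometry_score + cfg.w_c * texture_score


if __name__ == "__main__":
    cfg = SetupConfig()
    # With w_n = 0.96, the geometry term dominates the blended score.
    print(combined_score(0.9, 0.2, cfg))
    # Walk through the delta_1 schedule quoted for ScanNet++.
    for delta1 in DELTA1_BY_DATASET["ScanNet++"]:
        print(delta1)
```

Under this assumed form, the two weights sum to 1, so the blended score stays in the same range as its inputs, and the small texture weight reflects the quoted concern that RGB values alone are not robust.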