Memorize What Matters: Emergent Scene Decomposition from Multitraverse
Authors: Yiming Li, Zehong Wang, Yue Wang, Zhiding Yu, Zan Gojcic, Marco Pavone, Chen Feng, Jose M. Alvarez
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We build the Mapverse benchmark, sourced from the Ithaca365 and nuPlan datasets, to evaluate our method in unsupervised 2D segmentation, 3D reconstruction, and neural rendering. Extensive results verify the effectiveness and potential of our method for self-driving and robotics. (Section 5: Experiments) |
| Researcher Affiliation | Collaboration | Yiming Li (1, 2), Zehong Wang (1), Yue Wang (2, 3), Zhiding Yu (2), Zan Gojcic (2), Marco Pavone (2, 4), Chen Feng (1), Jose M. Alvarez (2). Affiliations: (1) NYU, (2) NVIDIA, (3) USC, (4) Stanford University |
| Pseudocode | Yes | Section A.2 (Workflow of Feature Residuals Mining); Algorithm 1: Workflow of Feature Residuals Mining |
| Open Source Code | Yes | Code and data are released at https://github.com/NVlabs/3DGM. |
| Open Datasets | Yes | We build the Mapverse benchmark, sourced from the Ithaca365 [5] and nuPlan [6] datasets, to evaluate our method in unsupervised 2D segmentation, 3D reconstruction, and neural rendering. |
| Dataset Splits | No | We set test/training views as 1/8. |
| Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using specific techniques and models like PCA, KL divergence, L1 loss, DINOv2 [4] features, and OpenCV functions [68], but it does not specify version numbers for general software dependencies such as Python, PyTorch/TensorFlow, or other libraries. |
| Experiment Setup | Yes | For efficiency, we compress feature dimensions from 768 to 64 using PCA. Our model uses KL divergence for feature alignment and L1 loss for RGB reconstruction. All experiments are conducted on a single NVIDIA RTX 3090 GPU. Algorithm 1 (Workflow of Feature Residuals Mining). Input: feature residuals $\{L_{\mathrm{feat}}(F_t(\xi_t; G), F_t)\}_{t=1,2,\dots,T}$, activation threshold $\delta_1 = 0.3$, size threshold $\delta_2 = 100$, skyline threshold $\delta_3 = 0.7$, merging threshold $\delta_4 = 10$, and default parameters for contour detection. |
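
The Experiment Setup row names two loss terms: KL divergence for feature alignment and L1 loss for RGB reconstruction. The following is a minimal, standard-library sketch of how such a pair of losses could be computed for a single pixel. It is not the authors' implementation. The function names, the channel-wise softmax used to turn feature vectors into distributions, the direction of the KL term, and the `feat_weight` parameter are all assumptions made for illustration; the quoted text does not specify them.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def feature_loss(rendered_feat, target_feat):
    """KL divergence between target and rendered feature vectors.

    ASSUMPTION: each feature vector is converted to a distribution with a
    softmax over its channels; the source does not state the normalization
    or the direction of the divergence.
    """
    return kl_divergence(softmax(target_feat), softmax(rendered_feat))


def l1_loss(rendered_rgb, target_rgb):
    """Mean absolute error between rendered and target RGB values."""
    diffs = [abs(a - b) for a, b in zip(rendered_rgb, target_rgb)]
    return sum(diffs) / len(diffs)


def total_loss(rendered_feat, target_feat, rendered_rgb, target_rgb,
               feat_weight=1.0):
    """Sum of both terms; feat_weight is a hypothetical balancing factor."""
    return (l1_loss(rendered_rgb, target_rgb)
            + feat_weight * feature_loss(rendered_feat, target_feat))
```

In this sketch the feature term vanishes when the rendered and target features match, so large per-pixel values flag the places where a rendering disagrees with the observed features. That is the kind of quantity Algorithm 1 takes as its input, the feature residuals.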