Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Unified Framework for 3D Scene Understanding

Authors: Wei Xu, Chunsheng Shi, Sifan Tu, Xin Zhou, Dingkang Liang, Xiang Bai

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on three benchmarks, including ScanNet20, ScanRefer, and ScanNet200, demonstrate that the UniSeg3D consistently outperforms current SOTA methods, even those specialized for individual tasks.
Researcher Affiliation | Academia | Wei Xu, Chunsheng Shi, Sifan Tu, Xin Zhou, Dingkang Liang, Xiang Bai Huazhong University of Science and Technology EMAIL
Pseudocode | No | The paper describes its methodology in Section 3 and illustrates it with Fig. 2, but it does not provide formal pseudocode or algorithm blocks.
Open Source Code | Yes | Code and models are available at https://dk-liang.github.io/UniSeg3D/.
Open Datasets | Yes | Datasets. We evaluate the UniSeg3D on three benchmarks: ScanNet20 [6], ScanRefer [1], and ScanNet200 [47].
Dataset Splits | Yes | All models are trained for 512 epochs on a single NVIDIA RTX 4090 GPU and evaluated per 16 epochs on the validation set to find the best-performed model.
Hardware Specification | Yes | All models are trained for 512 epochs on a single NVIDIA RTX 4090 GPU and evaluated per 16 epochs on the validation set to find the best-performed model.
Software Dependencies | No | The paper mentions using a 'frozen CLIP [46] text encoder' but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | We adopt the AdamW optimizer with the polynomial schedule, setting an initial learning rate as 0.0001 and the weight decay as 0.05. All models are trained for 512 epochs on a single NVIDIA RTX 4090 GPU and evaluated per 16 epochs on the validation set to find the best-performed model.
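
The Experiment Setup entry gives an initial learning rate of 0.0001, a weight decay of 0.05, a polynomial schedule, 512 training epochs, and validation every 16 epochs. As a rough illustration of how those numbers fit together, the sketch below computes a polynomial learning-rate decay and the list of validation epochs in plain Python. It is not the authors' training code. The decay exponent (POWER), the learning-rate floor (MIN_LR), per-epoch rather than per-iteration decay, and the function names are all assumptions, since none of them is stated in the entry.

```python
# Values stated in the Experiment Setup entry.
BASE_LR = 1e-4        # initial learning rate
WEIGHT_DECAY = 0.05   # AdamW weight decay (listed for completeness; unused below)
TOTAL_EPOCHS = 512    # training length
EVAL_INTERVAL = 16    # validate every 16 epochs

# Assumptions: these values are NOT given in the entry.
POWER = 0.9           # hypothetical decay exponent
MIN_LR = 0.0          # hypothetical learning-rate floor


def poly_lr(epoch, base_lr=BASE_LR, total_epochs=TOTAL_EPOCHS,
            power=POWER, min_lr=MIN_LR):
    """Polynomial decay from base_lr at epoch 0 to min_lr at total_epochs."""
    if not 0 <= epoch <= total_epochs:
        raise ValueError("epoch must lie in [0, total_epochs]")
    progress = epoch / total_epochs
    return (base_lr - min_lr) * (1.0 - progress) ** power + min_lr


def eval_epochs(total_epochs=TOTAL_EPOCHS, interval=EVAL_INTERVAL):
    """Completed-epoch counts at which the validation set is evaluated."""
    return list(range(interval, total_epochs + 1, interval))


if __name__ == "__main__":
    for epoch in (0, 128, 256, 384, 512):
        print(f"epoch {epoch:3d}: lr = {poly_lr(epoch):.3e}")
    print(len(eval_epochs()), "validation runs")
```

Under these assumptions the schedule starts at 0.0001, reaches the floor at epoch 512, and yields 32 validation checkpoints from which the best-performing model would be selected.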