MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps
Authors: Yating Xu, Chen Li, Gim Hee Lee
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on ScanNet and ARKitScenes datasets are conducted to show the superiority of our model. (Abstract) The experiments support our claims. (NeurIPS Paper Checklist, Q1) We do not propose any theories. (NeurIPS Paper Checklist, Q3) |
| Researcher Affiliation | Academia | (1) Department of Computer Science, National University of Singapore; (2) Institute of High Performance Computing, A*STAR; (3) Centre for Frontier AI Research, A*STAR (First page) |
| Pseudocode | No | The paper describes the method steps in prose and mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/Pixie8888/MVSDet. (Abstract) |
| Open Datasets | Yes | We conduct experiments on the ScanNet and ARKitScenes datasets. (Section 4) We properly cite the datasets we use. (NeurIPS Paper Checklist, Q12) |
| Dataset Splits | Yes | ScanNet has 1,201 and 312 scans for training and testing, respectively. ARKitScenes has 4,498 and 549 scans for training and testing, respectively. (Section 4.1) |
| Hardware Specification | Yes | All experiments are conducted on two NVIDIA A6000 GPUs. (Section 4.2) All models are ran on 2 A6000 GPUs. (Section 4.5) |
| Software Dependencies | No | We use AdamW optimizer with learning rate 0.0002, total epochs of 12 and batch size of 1. All experiments are conducted on two NVIDIA A6000 GPUs. (Section 4.2) No specific versions for Python, PyTorch, CUDA, etc., are provided. |
| Experiment Setup | Yes | images are resized into (240, 320). During training, we input 40 images to the detection branch. During testing, the detection branch takes in 100 images... The size of the 3D volume is (Nx = 40, Ny = 40, Nz = 16)... The depth range is empirically set as [0.2m, 5m]. The number of depth planes M is set to 12 and k = 3 depth proposals... We use AdamW optimizer with learning rate 0.0002, total epochs of 12 and batch size of 1. (Section 4.2) |
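
The hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration object, which may help when trying to reproduce the reported runs. The sketch below is illustrative only: the class name `MVSDetSetup`, its field names, and the helper `uniform_depth_planes` are hypothetical and do not come from the authors' repository. The numeric values are taken from the quoted text. The uniform spacing of the depth planes is an assumption, since the quoted text gives only the depth range and the number of planes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MVSDetSetup:
    """Hypothetical container for the values quoted from Section 4.2."""
    image_size: Tuple[int, int] = (240, 320)       # as quoted; axis order not stated here
    train_views: int = 40                          # images fed to the detection branch in training
    test_views: int = 100                          # images fed to the detection branch in testing
    volume_size: Tuple[int, int, int] = (40, 40, 16)  # (Nx, Ny, Nz)
    depth_range_m: Tuple[float, float] = (0.2, 5.0)   # metres
    num_depth_planes: int = 12                     # M
    num_depth_proposals: int = 3                   # k
    optimizer: str = "AdamW"
    learning_rate: float = 2e-4
    epochs: int = 12
    batch_size: int = 1
    num_gpus: int = 2                              # NVIDIA A6000


def uniform_depth_planes(cfg: MVSDetSetup) -> List[float]:
    """Place M depth hypotheses evenly over the depth range.

    ASSUMPTION: even spacing in depth. The paper excerpt does not say
    how the planes are distributed between the near and far bounds.
    """
    near, far = cfg.depth_range_m
    step = (far - near) / (cfg.num_depth_planes - 1)
    return [near + i * step for i in range(cfg.num_depth_planes)]


if __name__ == "__main__":
    cfg = MVSDetSetup()
    print(cfg)
    print([round(d, 3) for d in uniform_depth_planes(cfg)])
```

Freezing the dataclass keeps the quoted values from being changed by accident during an experiment. Software versions are absent from the sketch because the paper does not report them.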