MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps

Authors: Yating Xu, Chen Li, Gim Hee Lee

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on ScanNet and ARKitScenes datasets are conducted to show the superiority of our model. (Abstract) The experiments support our claims. (NeurIPS Paper Checklist, Q1) We do not propose any theories. (NeurIPS Paper Checklist, Q3)
Researcher Affiliation | Academia | Department of Computer Science, National University of Singapore; Institute of High Performance Computing, A*STAR; Centre for Frontier AI Research, A*STAR (First page)
Pseudocode | No | The paper describes the method steps in prose and mathematical formulas but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/Pixie8888/MVSDet. (Abstract)
Open Datasets | Yes | We conduct experiments on the ScanNet and ARKitScenes datasets. (Section 4) We properly cite the datasets we use. (NeurIPS Paper Checklist, Q12)
Dataset Splits | Yes | ScanNet has 1,201 and 312 scans for training and testing, respectively. ARKitScenes has 4,498 and 549 scans for training and testing, respectively. (Section 4.1)
Hardware Specification | Yes | All experiments are conducted on two NVIDIA A6000 GPUs. (Section 4.2) All models are run on 2 A6000 GPUs. (Section 4.5)
Software Dependencies | No | We use AdamW optimizer with learning rate 0.0002, total epochs of 12 and batch size of 1. All experiments are conducted on two NVIDIA A6000 GPUs. (Section 4.2) No specific versions for Python, PyTorch, CUDA, etc., are provided.
Experiment Setup | Yes | Images are resized to (240, 320). During training, we input 40 images to the detection branch. During testing, the detection branch takes in 100 images... The size of the 3D volume is (Nx = 40, Ny = 40, Nz = 16)... The depth range is empirically set as [0.2 m, 5 m]. The number of depth planes M is set to 12 and k = 3 depth proposals... We use AdamW optimizer with learning rate 0.0002, total epochs of 12 and batch size of 1. (Section 4.2)
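
The experiment setup quoted from Section 4.2 can be collected into a small configuration object. The following is a minimal sketch, not the authors' code. The class name, field names, and the `depth_planes` and `top_k_depths` helpers are hypothetical. Uniform spacing of the depth planes and selecting the k most probable planes as proposals are assumptions made for illustration, since the summary does not say how planes are spaced or how proposals are chosen. Only the numeric values come from the setup row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted from Section 4.2; all field names are hypothetical."""
    image_size: tuple = (240, 320)       # resized input resolution
    train_views: int = 40                # images fed to the detection branch in training
    test_views: int = 100                # images fed to the detection branch in testing
    volume_size: tuple = (40, 40, 16)    # (Nx, Ny, Nz) of the 3D volume
    depth_range_m: tuple = (0.2, 5.0)    # depth range in metres
    num_depth_planes: int = 12           # M
    num_depth_proposals: int = 3         # k
    optimizer: str = "AdamW"
    learning_rate: float = 2e-4
    epochs: int = 12
    batch_size: int = 1


def depth_planes(cfg):
    """Return M depth hypotheses across the depth range.

    Assumption: planes are spaced uniformly in depth. The summary does not
    state the spacing, so this is illustrative only.
    """
    near, far = cfg.depth_range_m
    step = (far - near) / (cfg.num_depth_planes - 1)
    return [near + i * step for i in range(cfg.num_depth_planes)]


def top_k_depths(probs, planes, k):
    """Return the depths of the k most probable planes, most probable first.

    Assumption: proposals are the k highest-probability planes. This rule is
    hypothetical and is not taken from the summary.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [planes[i] for i in order[:k]]


cfg = ExperimentSetup()
planes = depth_planes(cfg)
```

With these defaults, `planes` holds 12 values running from 0.2 m to 5 m, and `top_k_depths(probs, planes, cfg.num_depth_proposals)` would pick 3 of them for a given per-plane probability list. Software versions are deliberately absent from the sketch because the paper does not report them.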