Inner-Outer Aware Reconstruction Model for Monocular 3D Scene Reconstruction

Authors: Yu-Kun Qiu, Guo-Hao Xu, Wei-Shi Zheng

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results on ScanNet, ICL-NUIM and TUM-RGBD datasets demonstrate the effectiveness and generalization of our model.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Sun Yat-sen University, China; 2 Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China
Pseudocode | No | The paper includes Figure 2, which illustrates the overall pipeline, but this is a diagram rather than a structured pseudocode block or algorithm.
Open Source Code | Yes | The code is available at https://github.com/YorkQiu/InnerOuterAwareReconstruction.
Open Datasets | Yes | Following previous works [8, 9, 11], we trained models on the ScanNet [40] dataset. ... Previous works evaluate the generalization performance of the models trained on ScanNet on TUM-RGBD [41] and ICL-NUIM [42] datasets.
Dataset Splits | Yes | We follow the official train/eval/test split, where 1201 videos are used for training, 312 videos are used for evaluating and 100 videos are used for testing.
Hardware Specification | Yes | Training our model takes about 90 hours on a single Nvidia RTX 3090 graphic card.
Software Dependencies | No | The paper mentions using the 'Adam optimizer [43]', 'MnasNet-B1 [44]' and 'feature pyramid network [45]', but does not provide specific version numbers for these or other software libraries (e.g., Python or PyTorch versions). A hedged backbone sketch follows the table.
Experiment Setup | Yes | We use the Adam optimizer [43] with β1 = 0.9, β2 = 0.999 and ϵ = 10⁻⁸. The learning rate is set to α = 10⁻³ and is linearly warmed up from 10⁻¹⁰ over 2000 steps. We trained our model for 500 epochs. ... The CNN backbone is fixed in the first 350 epochs and is finetuned with a learning rate α = 10⁻⁴ in the last 150 epochs. The batch size is set to 4 and drops to 2 in the finetuning stage. ... the voxel size of the fine/medium/coarse level is set to 4cm/8cm/16cm and the TSDF truncation distance is set to triple the voxel size. A hedged training-configuration sketch follows the table.
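The Software Dependencies row names the encoder building blocks (MnasNet-B1 backbone, feature pyramid network) without pinning library versions. Below is a minimal sketch, assuming a PyTorch environment with timm and torchvision, of how such an encoder could be instantiated; the timm model name "mnasnet_100", the FPN output width of 64 channels, and the input resolution are illustrative assumptions, not values taken from the paper.

from collections import OrderedDict

import timm
import torch
from torchvision.ops import FeaturePyramidNetwork

# Multi-scale MnasNet-B1 features via timm's features_only interface
# ("mnasnet_100" is assumed to be the B1 variant at depth multiplier 1.0).
backbone = timm.create_model("mnasnet_100", pretrained=False, features_only=True)

# Feature pyramid network over the backbone's stage outputs.
fpn = FeaturePyramidNetwork(
    in_channels_list=backbone.feature_info.channels(),  # channel count of each stage
    out_channels=64,                                     # assumed FPN width
)

images = torch.randn(1, 3, 480, 640)                      # one example RGB frame
features = backbone(images)                               # list of maps, fine to coarse
pyramid = fpn(OrderedDict((str(i), f) for i, f in enumerate(features)))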
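The Experiment Setup row quotes concrete hyperparameters. The sketch below restates them as a PyTorch training configuration, as a hedged illustration only: the model.backbone attribute, the parameter-group split and the scheduler wiring are hypothetical, while the numeric values (Adam betas and epsilon, learning rates, 2000-step warmup, 350/150-epoch schedule, batch sizes, voxel sizes, TSDF truncation) come from the quoted text.

import torch

# Fine / medium / coarse voxel sizes (metres); TSDF truncation = 3x the voxel size.
VOXEL_SIZES = [0.04, 0.08, 0.16]
TSDF_TRUNCATION = [3.0 * v for v in VOXEL_SIZES]

BATCH_SIZE_MAIN, BATCH_SIZE_FINETUNE = 4, 2      # batch size drops to 2 when finetuning
EPOCHS_TOTAL, EPOCHS_BACKBONE_FROZEN = 500, 350  # backbone finetuned for the last 150 epochs


def build_optimizer(model, finetune_backbone: bool = False):
    """Adam with beta1=0.9, beta2=0.999, eps=1e-8; base lr 1e-3, backbone lr 1e-4."""
    backbone_params = list(model.backbone.parameters())  # hypothetical attribute name
    other_params = [p for n, p in model.named_parameters() if not n.startswith("backbone")]

    param_groups = [{"params": other_params, "lr": 1e-3}]
    if finetune_backbone:
        param_groups.append({"params": backbone_params, "lr": 1e-4})
    else:
        for p in backbone_params:            # keep the CNN backbone fixed at first
            p.requires_grad_(False)
    return torch.optim.Adam(param_groups, betas=(0.9, 0.999), eps=1e-8)


def warmup_factor(step: int, warmup_steps: int = 2000,
                  start_lr: float = 1e-10, base_lr: float = 1e-3) -> float:
    """Linear warmup from 1e-10 to the base learning rate over the first 2000 steps."""
    if step >= warmup_steps:
        return 1.0
    return (start_lr + (base_lr - start_lr) * step / warmup_steps) / base_lr


# Usage: attach the warmup as a multiplicative LR schedule (call scheduler.step() per iteration).
# optimizer = build_optimizer(model)
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_factor)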