Attention-Based Multi-Modal Fusion Network for Semantic Scene Completion
Authors: Siqi Li, Changqing Zou, Yipeng Li, Xibin Zhao, Yue Gao (pp. 11402–11409)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our method on both the synthetic SUNCG-RGBD dataset and the real NYUv2 dataset, and the results show that our method achieves gains of 2.5% and 2.6%, respectively, over the state-of-the-art method. |
| Researcher Affiliation | Collaboration | Siqi Li,1 Changqing Zou,2 Yipeng Li,3 Xibin Zhao,1 Yue Gao1 1BNRist, KLISS, School of Software, Tsinghua University, China 2Huawei Noah's Ark Lab, 3Department of Automation, Tsinghua University, China |
| Pseudocode | No | The paper describes the network architecture and process flow in detail with text and diagrams (Figure 2 and 3), but it does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We validate our method on both the synthetic SUNCG-RGBD dataset and the real NYUv2 dataset. The NYUv2 (Silberman et al. 2012) is a real scene dataset, consisting of 1449 indoor scenes. SUNCG-RGBD, a synthetic dataset proposed by Liu et al. (Liu et al. 2018), is a subset of the SUNCG dataset (Song et al. 2017). |
| Dataset Splits | Yes | The NYUv2 (Silberman et al. 2012) is a real scene dataset, consisting of 1449 indoor scenes. The dataset is divided into 795 training and 654 testing samples, each scene associated with RGB-D images. SUNCG-RGBD [...] It consists of 13011 training samples and 499 testing samples. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments, such as particular GPU or CPU models, or memory configurations. |
| Software Dependencies | No | The paper mentions using a 'cross-entropy loss' and 'SGD optimizer' with specific parameters, but it does not list any specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow, CUDA versions) that would be needed for replication. |
| Experiment Setup | Yes | The training procedure consists of two steps. We first pre-train the 2D segmentation network with the supervision of 2D semantic segmentation ground truth, and then train the whole model end-to-end. We use cross-entropy loss and an SGD optimizer with a momentum of 0.9, a weight decay of 5e-4, and a batch size of 1. The learning rate of the 2D segmentation network and 3D scene completion network is 0.001 and 0.01, respectively. |
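The optimizer settings reported in the Experiment Setup row can be sketched as below. This is a minimal illustration assuming PyTorch (the paper names no framework); the two `nn.Module` stand-ins are hypothetical placeholders for the paper's 2D segmentation and 3D scene completion networks, which are far larger. The sketch shows the one detail that needs care: the two branches use different learning rates (0.001 and 0.01), which maps naturally onto SGD parameter groups.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the paper's two sub-networks.
seg2d = nn.Conv2d(3, 12, kernel_size=3, padding=1)    # 2D segmentation branch
comp3d = nn.Conv3d(12, 12, kernel_size=3, padding=1)  # 3D completion branch

# One SGD optimizer with two parameter groups, mirroring the reported
# hyper-parameters: momentum 0.9, weight decay 5e-4, and per-branch
# learning rates of 0.001 (2D network) and 0.01 (3D network).
optimizer = torch.optim.SGD(
    [
        {"params": seg2d.parameters(), "lr": 1e-3},
        {"params": comp3d.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
    weight_decay=5e-4,
)

# Cross-entropy loss, as stated in the paper; batch size would be 1.
criterion = nn.CrossEntropyLoss()
```

Per the paper's two-step procedure, `seg2d` would first be pre-trained alone against 2D segmentation ground truth, after which the whole model is trained end-to-end with this optimizer.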