Semantic Consistency Networks for 3D Object Detection

Authors: Wenwen Wei, Ping Wei, Nanning Zheng. Pages 2861-2869.

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our model is evaluated on two challenging datasets and achieves comparable results to the state-of-the-art methods."
Researcher Affiliation | Academia | Wenwen Wei, Ping Wei, Nanning Zheng, Xi'an Jiaotong University, Xi'an, China. wwwei@stu.xjtu.edu.cn, pingwei@xjtu.edu.cn, nnzheng@mail.xjtu.edu.cn
Pseudocode | No | The paper describes the architecture and components of SCNet, but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | "Our proposed SCNet is evaluated on two challenging datasets for 3D object detection task. SUN RGB-D (Song, Lichtenberg, and Xiao 2015) is an indoor benchmark for 3D scene understanding. ScanNet V2 (Dai et al. 2017) is a richly annotated indoor dataset containing 2.5M views in 1,513 scenes."
Dataset Splits | No | Following the standard split, 5,285 RGB-D image pairs are used as training samples and the rest for testing (SUN RGB-D); the ScanNet V2 training set contains 1,201 samples and the test set 312 samples. The paper specifies training and testing splits, but does not explicitly mention a separate validation split or how it was derived.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU or CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using Adaptive Moment Estimation (Adam) for optimization, but does not specify versions for any programming languages, libraries, or other software components.
Experiment Setup | Yes | The entire model is trained end-to-end for 180 epochs with a mini-batch size of 8. Adaptive Moment Estimation (Adam) (Kingma and Ba 2015) is used for optimization, with an initial learning rate of 0.0015 for SUN RGB-D and 0.01 for ScanNet V2. The learning rate decay steps are set to [100, 130, 160] with corresponding decay rates of [0.1, 0.1, 0.1]. The model input is organized by sub-sampling a fixed number of points (N = 40,000) from the original data. For data augmentation, the point clouds are randomly flipped in both horizontal directions, rotated by Uniform[-5°, 5°], and scaled by Uniform[0.9, 1.1], as in VoteNet (Qi et al. 2019).
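Since no code was released, the reported schedule and augmentation can only be sketched. The following is a minimal, hedged reconstruction of the stated hyperparameters; the function names (`get_lr`, `augment_point_cloud`) are illustrative and not from the paper, and the rotation axis (the vertical/up axis, as in VoteNet) is an assumption.

```python
import numpy as np

def get_lr(epoch, base_lr=0.0015, decay_steps=(100, 130, 160), decay_rate=0.1):
    """Step learning-rate schedule as reported: initial LR 0.0015
    (SUN RGB-D; 0.01 for ScanNet V2), multiplied by 0.1 at epochs
    100, 130, and 160 over 180 total epochs."""
    lr = base_lr
    for step in decay_steps:
        if epoch >= step:
            lr *= decay_rate
    return lr

def augment_point_cloud(points, rng=np.random):
    """VoteNet-style augmentation on an (N, 3) xyz array:
    random flips in both horizontal directions, a rotation drawn from
    Uniform[-5 deg, 5 deg] (assumed to be about the up axis), and a
    global scale drawn from Uniform[0.9, 1.1]."""
    points = points.copy()
    if rng.random() > 0.5:               # flip along x
        points[:, 0] = -points[:, 0]
    if rng.random() > 0.5:               # flip along y
        points[:, 1] = -points[:, 1]
    angle = np.deg2rad(rng.uniform(-5.0, 5.0))
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])    # rotation about the z (up) axis
    points = points @ rot.T
    points *= rng.uniform(0.9, 1.1)      # global scaling
    return points
```

In practice, `get_lr` would be applied once per epoch to the Adam optimizer's learning rate, and `augment_point_cloud` to each sub-sampled cloud of N = 40,000 points before it is fed to the model.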