TANet: Robust 3D Object Detection from Point Clouds with Triple Attention
Authors: Zhe Liu, Xin Zhao, Tengteng Huang, Ruolan Hu, Yu Zhou, Xiang Bai
AAAI 2020, pp. 11677-11684 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the validation set of KITTI dataset demonstrate that, in the challenging noisy cases, i.e., adding additional random noisy points around each object, the presented approach goes far beyond state-of-the-art approaches. Furthermore, for the 3D object detection task of the KITTI benchmark, our approach ranks the first place on Pedestrian class, by using the point clouds as the only input. (An illustrative noise-injection sketch follows the table.) |
| Researcher Affiliation | Academia | Zhe Liu¹, Xin Zhao², Tengteng Huang¹, Ruolan Hu¹, Yu Zhou¹, Xiang Bai¹; ¹Huazhong University of Science and Technology, Wuhan, China, 430074; ²Institute of Automation, Chinese Academy of Sciences, Beijing, China, 100190 |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code or a link to a code repository. |
| Open Datasets | Yes | All of the experiments are conducted on the KITTI dataset (Geiger, Lenz, and Urtasun 2012), which contains 7481 training samples and 7518 test samples. |
| Dataset Splits | Yes | we follow (Qi et al. 2018; Chen et al. 2017) to split the training samples into a training set consisting of 3712 samples and a validation set consisting of 3769 samples. (A sketch of this split follows the table.) |
| Hardware Specification | Yes | All of our experiments are evaluated on a single Titan V GPU card. |
| Software Dependencies | No | The paper mentions optimization and loss functions but does not provide specific version numbers for software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Adaptive Moment Estimation (Adam) (Kingma and Ba 2015) is used for optimization with the learning rate of 0.0002. And our model is trained for about 200 epochs with a mini-batch size of 2. In our settings, a large value of N is selected to be 100 for capturing sufficient cues to explore the spatial relationships. The dimension of the feature map for each voxel FM is 64 (e.g., C = 64). In our experiments, we set α, β, and λ to 1.0, 2.0 and 2.0 for total loss, respectively. (A hedged training-configuration sketch follows the table.) |
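
The robustness experiments quoted in the Research Type row add random noise points around each object before evaluation. Below is a minimal sketch of such a perturbation, assuming the point cloud and box centers are NumPy arrays and that noise is sampled uniformly within a fixed radius of each object center; the function name, radius, and per-object point count are illustrative assumptions, not code from the paper (which is not released).

```python
import numpy as np

# Illustrative sketch of the noise protocol described in the abstract:
# append random points around each ground-truth object. The uniform
# sampling, radius, and point count are assumptions for illustration.
def add_noise_points(points, object_centers, num_noise=50, radius=2.0, rng=None):
    """Append `num_noise` points sampled within `radius` of each object center."""
    rng = np.random.default_rng() if rng is None else rng
    augmented = [np.asarray(points, dtype=float)]
    for center in object_centers:
        center = np.asarray(center, dtype=float)
        offsets = rng.uniform(-radius, radius, size=(num_noise, 3))
        augmented.append(center[None, :] + offsets)
    return np.concatenate(augmented, axis=0)
```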
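For the Dataset Splits row, the 3712/3769 split follows the protocol of Chen et al. (2017). A minimal loader sketch is below, assuming the conventional `train.txt`/`val.txt` index files used by that protocol; the file names and directory layout are assumptions, not artifacts shipped with this paper.

```python
from pathlib import Path

# Sketch of loading the standard KITTI 3D-detection split
# (3712 training samples / 3769 validation samples).
def load_split(split_dir: str):
    train_ids = Path(split_dir, "train.txt").read_text().split()
    val_ids = Path(split_dir, "val.txt").read_text().split()
    assert len(train_ids) == 3712 and len(val_ids) == 3769, \
        "expected the standard 3712/3769 KITTI split"
    return train_ids, val_ids
```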
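The Experiment Setup row reports Adam with a learning rate of 0.0002, roughly 200 epochs, a mini-batch size of 2, and total-loss weights α = 1.0, β = 2.0, λ = 2.0. A hedged PyTorch sketch of those settings follows; the placeholder model and the mapping of each weight to a specific loss term are assumptions, since the paper releases no code.

```python
import torch

# Reported hyperparameters from the paper.
LEARNING_RATE = 2e-4   # "learning rate of 0.0002"
NUM_EPOCHS = 200       # "trained for about 200 epochs"
BATCH_SIZE = 2         # "mini-batch size of 2"
ALPHA, BETA, LAMBDA = 1.0, 2.0, 2.0  # total-loss weights

# Placeholder module standing in for the TANet detector (assumption).
model = torch.nn.Linear(64, 14)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

def total_loss(cls_loss, reg_loss, refine_loss):
    """Weighted sum with the reported weights; which term each weight
    multiplies is an assumption, as the paper only gives the three values."""
    return ALPHA * cls_loss + BETA * reg_loss + LAMBDA * refine_loss
```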