iDet3D: Towards Efficient Interactive Object Detection for LiDAR Point Clouds
Authors: Dongmin Choi, Wonwoo Cho, Kangyeol Kim, Jaegul Choo
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through our extensive experiments, we present that our method can construct precise annotations in a few clicks, which shows the practicality as an efficient annotation tool for 3D object detection. In the experiments, our main interest is two-fold: (1) demonstrating the effectiveness of the interactive approach compared to state-of-the-art automatic (non-interactive) detectors, i.e., the performance can be significantly improved by using a few user clicks, and (2) validating the effectiveness of the proposed components of iDet3D (NCS, DCG, and SCP) by comparing iDet3D to the vanilla model. |
| Researcher Affiliation | Collaboration | Dongmin Choi1,2*, Wonwoo Cho1,2*, Kangyeol Kim1,2, Jaegul Choo1,2; 1Letsur Inc.; 2Korea Advanced Institute of Science and Technology; {dmchoi, wcho, kangyeolk, jchoo}@kaist.ac.kr |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper mentions referring to existing codebases for IA-SSD and SASA, but does not provide concrete access or an explicit statement about releasing the source code for iDet3D. |
| Open Datasets | Yes | Datasets. KITTI benchmark (Geiger, Lenz, and Urtasun 2012) is a widely used 3D object detection dataset... We also evaluate iDet3D on a more challenging nuScenes dataset (Caesar et al. 2020)... |
| Dataset Splits | Yes | KITTI benchmark (Geiger, Lenz, and Urtasun 2012) is a widely used 3D object detection dataset, which consists of 3,712 training and 3,769 validation samples with three object classes: Car, Pedestrian, and Cyclist. |
| Hardware Specification | Yes | We use 4 NVIDIA RTX A6000 GPUs for experiments. |
| Software Dependencies | No | The paper refers to existing codebases (IA-SSD, SASA) but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | The number of clicks $K$ is determined as $\min(N_u, N_o)$, where $N_u$ is sampled from $\{0, \dots, 10\}$ uniformly at random and $N_o$ refers to the number of existing objects in each scene. The distance threshold $\tau$ of user encodings is set to 2.0 in Eq. (1). For negative clicks, we set the maximum number $K_n$ to 10. |
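
The click-count rule quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code (no iDet3D source is reported as released); the function name `sample_num_clicks` and the constant `MAX_USER_CLICKS` are hypothetical, and the sketch assumes the sampling range is the integers 0 through 10 inclusive. It covers only the number of clicks per scene, not where the clicks are placed or how they are encoded.

```python
import random

# Assumed upper bound of the uniform sampling range for N_u (hypothetical name).
MAX_USER_CLICKS = 10


def sample_num_clicks(num_objects, rng=random):
    """Return K = min(N_u, N_o) for one training scene.

    N_u is drawn uniformly from the integers 0..MAX_USER_CLICKS (inclusive),
    and N_o is the number of annotated objects in the scene, so a scene never
    receives more simulated clicks than it has objects.
    """
    if num_objects < 0:
        raise ValueError("num_objects must be non-negative")
    n_u = rng.randint(0, MAX_USER_CLICKS)  # randint includes both endpoints
    return min(n_u, num_objects)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same values
    for n_o in (0, 3, 25):
        print(n_o, sample_num_clicks(n_o, rng))
```

Capping by the object count means sparse scenes see few or no simulated clicks, while crowded scenes are limited by the sampled value, so training covers both the zero-click case and the many-click case.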