4D Unsupervised Object Discovery
Authors: Yuqi Wang, Yuntao Chen, Zhao-Xiang Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the large-scale Waymo Open Dataset suggest that the localization network and Cluster Net achieve competitive performance on both class-agnostic 2D object detection and 3D instance segmentation, bridging the gap between unsupervised methods and fully supervised ones. |
| Researcher Affiliation | Academia | 1 Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA); 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 Centre for Artificial Intelligence and Robotics, HKISI, CAS |
| Pseudocode | No | The paper describes the algorithm steps in paragraph text and uses figures but does not provide structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Codes and models will be made available at https://github.com/Robertwyq/LSMOL. |
| Open Datasets | Yes | We evaluate our method on the challenging Waymo Open Dataset (WOD) [36], which provides 3D point clouds and 2D RGB image data that is suitable for our task setting. |
| Dataset Splits | Yes | The training and validation sets contain around 158k and 40k frames, respectively. |
| Hardware Specification | Yes | The network is trained on 8 GPUs (A100) with 2 images per GPU for 12k iterations. |
| Software Dependencies | No | The paper states: 'Our implementation is based on the open-sourced code of mmdetection3d [9] for 3D detection and detectron2 [44] for 2D detection.' While the underlying frameworks are mentioned, specific version numbers for these libraries or other software dependencies (like Python, PyTorch, etc.) are not provided. |
| Experiment Setup | Yes | The network is trained on 8 GPUs (A100) with 2 images per GPU for 12k iterations. The learning rate is initialized to 0.02 and is divided by 10 at the 6k and the 9k iterations. The weight decay and the momentum parameters are set as 10⁻⁴ and 0.9, respectively. For 3D Cluster Net... The voxel size is (0.32 m, 0.32 m, 6 m)... In the focal loss for class prediction, we set γ = 2.0, α = 0.8. The balance weight λ for Eq. 3 is set to 5. During inference, we set the minimum number of points to 5 for clustering. The Cluster Net is trained on 8 GPUs (A100) with 2 point clouds per GPU for 12 epochs. The learning rate is initialized to 10⁻⁵ and adopts the cyclic cosine strategy (target learning rate is 10⁻³). For hyper-parameters in HDBSCAN [5], we set the min cluster size to 15, and the others follow the default. |
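
The HDBSCAN settings reported in the experiment setup row can be illustrated with a short sketch. The snippet below is a minimal example assuming the open-source `hdbscan` Python package and an `(N, 3)` array of foreground LiDAR points; the function and variable names are illustrative and are not taken from the authors' released code.

```python
# Minimal sketch of the point-cloud clustering configuration described above,
# assuming the `hdbscan` package. Only min_cluster_size=15 comes from the
# paper's reported settings; everything else is illustrative.
import numpy as np
import hdbscan

def cluster_foreground_points(points: np.ndarray) -> np.ndarray:
    """Group (N, 3) foreground points into instance clusters; label -1 marks noise."""
    clusterer = hdbscan.HDBSCAN(min_cluster_size=15)  # reported value; other parameters left at defaults
    return clusterer.fit_predict(points)

# Usage with random points, for illustration only:
labels = cluster_foreground_points(np.random.rand(2000, 3))
```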