Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection
Authors: Cheng-Ju Ho, Chen-Hsuan Tai, Yen-Yu Lin, Ming-Hsuan Yang, Yi-Hsuan Tsai
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the ScanNet and SUN RGB-D benchmark datasets to demonstrate that our approach achieves state-of-the-art performance against existing methods. |
| Researcher Affiliation | Collaboration | Cheng-Ju Ho1 Chen-Hsuan Tai1 Yen-Yu Lin1 Ming-Hsuan Yang2,3 Yi-Hsuan Tsai3 1National Yang Ming Chiao Tung University 2University of California at Merced 3Google |
| Pseudocode | Yes | "Algorithm 1: Teacher Model" and "Algorithm 2: Student Model" |
| Open Source Code | Yes | The source code will be available at https://github.com/luluho1208/Diffusion-SS3D. |
| Open Datasets | Yes | We evaluate our method on two benchmarks, including the ScanNet [12] and SUN RGB-D [46] datasets, with the evaluation settings adopted in the prior semi-supervised 3D object detection works [16, 53, 63]. |
| Dataset Splits | Yes | We split both benchmarks into labeled and unlabeled data for SSL, using labeled data ratios of 5%, 10%, and 20% for ScanNet and 1%, 5%, and 10% for SUN RGB-D. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are mentioned for the experiments. |
| Software Dependencies | No | In this work, we employ PointNet++ [31] as the encoder and IoU-aware VoteNet [53] as the decoder. |
| Experiment Setup | Yes | In the pre-training phase, we only use labeled data with a batch size of 4 to train the diffusion model. The model is trained for 900 epochs with an initial learning rate of 0.005. Like [16, 53], the learning rate then decays at the 400th, 600th, and 800th epochs with a factor of 0.1. In the phase of semi-supervised learning, a batch is composed of 4 labeled and 8 unlabeled data. The pre-trained model is used for initializing both the teacher and student models. The student model is trained for 1,000 epochs using the AdamW optimizer, with an initial learning rate of 0.005. Like [16, 53], the learning rate decays at the 400th, 600th, 800th, and 900th epochs with factors of 0.3, 0.3, 0.1, and 0.1, respectively. For the diffusion process, we set the maximum timesteps to 1000 and the number of proposal boxes Nb to 128. |
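The dataset-split row above describes holding out a small labeled fraction (e.g. 5%, 10%, 20% of ScanNet) and treating the remainder as unlabeled. A minimal sketch of such a split, assuming scenes are identified by simple IDs (the function name and seed handling are illustrative, not from the authors' code):

```python
import random


def split_labeled_unlabeled(scene_ids, labeled_ratio, seed=0):
    """Randomly partition scene IDs into a labeled subset of the given
    ratio and an unlabeled remainder, as in semi-supervised setups."""
    ids = list(scene_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for a reproducible split
    n_labeled = max(1, int(len(ids) * labeled_ratio))
    return ids[:n_labeled], ids[n_labeled:]


# Example: a 5% labeled split over 100 hypothetical scenes.
labeled, unlabeled = split_labeled_unlabeled(range(100), 0.05)
```

In practice the published splits would be fixed files shipped with the code, not regenerated per run; this sketch only shows the partitioning logic.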
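The experiment-setup row specifies a step schedule for the semi-supervised phase: initial learning rate 0.005, decayed at epochs 400, 600, 800, and 900 by factors 0.3, 0.3, 0.1, and 0.1. A small sketch of that schedule (the function name is illustrative; the authors' training code may implement it differently, e.g. via an optimizer scheduler):

```python
# Semi-supervised-phase schedule as stated in the table:
# start at 0.005, multiply by the listed factor once the epoch
# passes each milestone.
INITIAL_LR = 0.005
MILESTONES = [(400, 0.3), (600, 0.3), (800, 0.1), (900, 0.1)]


def lr_at_epoch(epoch):
    """Return the learning rate in effect at the given epoch."""
    lr = INITIAL_LR
    for milestone, factor in MILESTONES:
        if epoch >= milestone:
            lr *= factor
    return lr
```

For example, `lr_at_epoch(600)` applies the first two factors (0.005 × 0.3 × 0.3 = 0.00045), and after epoch 900 all four factors are in effect.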