Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection
Authors: Cheng-Ju Ho, Chen-Hsuan Tai, Yen-Yu Lin, Ming-Hsuan Yang, Yi-Hsuan Tsai
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the ScanNet and SUN RGB-D benchmark datasets to demonstrate that our approach achieves state-of-the-art performance against existing methods. |
| Researcher Affiliation | Collaboration | Cheng-Ju Ho¹, Chen-Hsuan Tai¹, Yen-Yu Lin¹, Ming-Hsuan Yang²,³, Yi-Hsuan Tsai³ (¹National Yang Ming Chiao Tung University, ²University of California at Merced, ³Google) |
| Pseudocode | Yes | "Algorithm 1: Teacher Model" and "Algorithm 2: Student Model" |
| Open Source Code | Yes | The source code will be available at https://github.com/luluho1208/Diffusion-SS3D. |
| Open Datasets | Yes | We evaluate our method on two benchmarks, including the ScanNet [12] and SUN RGB-D [46] datasets, with the evaluation settings adopted in the prior semi-supervised 3D object detection works [16, 53, 63]. |
| Dataset Splits | Yes | We split both benchmarks into labeled and unlabeled data for SSL, using labeled data ratios of 5%, 10%, and 20% for ScanNet and 1%, 5%, and 10% for SUN RGB-D. A sketch of such a split follows the table. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are mentioned for the experiments. |
| Software Dependencies | No | No software versions or library dependencies are specified; the paper only names the architecture: "In this work, we employ PointNet++ [31] as the encoder and IoU-aware VoteNet [53] as the decoder." |
| Experiment Setup | Yes | In the pre-training phase, we only use labeled data with a batch size of 4 to train the diffusion model. The model is trained for 900 epochs with an initial learning rate of 0.005. Like [16, 53], the learning rate then decays at the 400th, 600th, and 800th epochs with a factor of 0.1. In the phase of semi-supervised learning, a batch is composed of 4 labeled and 8 unlabeled data. The pre-trained model is used for initializing both the teacher and student models. The student model is trained for 1,000 epochs using the AdamW optimizer, with an initial learning rate of 0.005. Like [16, 53], the learning rate decays at the 400th, 600th, 800th, and 900th epochs with factors of 0.3, 0.3, 0.1, and 0.1, respectively. For the diffusion process, we set the maximum timesteps to 1000 and the number of proposal boxes N_b to 128. A sketch of this learning-rate schedule follows the table. |
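
The Dataset Splits row can be illustrated with a random partition of scene indices at a fixed labeled ratio. This is a minimal sketch, not the authors' splitting code: the function name, the fixed seed, and the ScanNet training-set size of 1,201 scenes are assumptions made for illustration.

```python
import random

def split_labeled_unlabeled(num_scenes: int, labeled_ratio: float, seed: int = 0):
    """Randomly partition scene indices into labeled and unlabeled subsets."""
    rng = random.Random(seed)            # fixed seed so the split is reproducible
    indices = list(range(num_scenes))
    rng.shuffle(indices)
    n_labeled = round(num_scenes * labeled_ratio)
    return indices[:n_labeled], indices[n_labeled:]

# Hypothetical example: a 5% labeled split over ScanNet's 1,201 training scenes.
labeled, unlabeled = split_labeled_unlabeled(1201, 0.05)
print(len(labeled), len(unlabeled))      # 60 1141
```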
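
The SSL-phase schedule in the Experiment Setup row uses different decay factors at different milestones (0.3, 0.3, 0.1, 0.1), which PyTorch's `MultiStepLR` cannot express directly because it applies one shared factor at every milestone; `LambdaLR` with a cumulative multiplier can. The sketch below assumes a standard PyTorch training loop with a placeholder model and is not the authors' released code.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Milestone epoch -> decay factor, as quoted for the SSL phase.
DECAYS = {400: 0.3, 600: 0.3, 800: 0.1, 900: 0.1}

def lr_multiplier(epoch: int) -> float:
    """Cumulative product of every factor whose milestone has been reached."""
    mult = 1.0
    for milestone, factor in DECAYS.items():
        if epoch >= milestone:
            mult *= factor
    return mult

model = torch.nn.Linear(8, 8)                  # placeholder for the student model
optimizer = AdamW(model.parameters(), lr=0.005)
scheduler = LambdaLR(optimizer, lr_lambda=lr_multiplier)

for epoch in range(1000):
    # ... one epoch over batches of 4 labeled + 8 unlabeled scenes ...
    optimizer.step()                           # stand-in for the real update
    scheduler.step()
```

Under this schedule the learning rate drops from 0.005 to 0.0015 at epoch 400, 4.5e-4 at 600, 4.5e-5 at 800, and 4.5e-6 at 900, reading the quoted factors as applied cumulatively (an interpretation the paper does not state explicitly).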