Context and Geometry Aware Voxel Transformer for Semantic Scene Completion

Authors: Zhu Yu, Runmin Zhang, Jiacheng Ying, Junchen Yu, Xiaohai Hu, Lun Luo, Si-Yuan Cao, Hui-Liang Shen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that CGFormer achieves state-of-the-art performance on the SemanticKITTI and SSCBench-KITTI-360 benchmarks, attaining a mIoU of 16.87 and 20.05, as well as an IoU of 45.99 and 48.07, respectively.
Researcher Affiliation | Collaboration | Zhu Yu¹, Runmin Zhang¹, Jiacheng Ying¹, Junchen Yu¹, Xiaohai Hu³, Lun Luo⁴, Si-Yuan Cao²,¹, Hui-Liang Shen¹ — ¹Zhejiang University, ²Ningbo Innovation Center, Zhejiang University, ³University of Washington, ⁴HAOMO.AI Technology Co., Ltd.
Pseudocode | No | The paper describes the architecture and processes using diagrams and text, but it does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | https://github.com/pkqbajng/CGFormer
Open Datasets | Yes | We evaluate our CGFormer on two datasets: SemanticKITTI [1] and SSCBench-KITTI-360 [22].
Dataset Splits | Yes | SemanticKITTI provides RGB images... The dataset includes 10 sequences for training, 1 sequence for validation, and 11 sequences for testing. SSCBench-KITTI-360 [22] offers 7 sequences for training, 1 sequence for validation, and 1 sequence for testing.
Hardware Specification | Yes | We train CGFormer for 25 epochs on 4 NVIDIA 4090 GPUs, with a batch size of 4. It approximately consumes 19 GB of GPU memory on each GPU during the training phase.
Software Dependencies | No | Consistent with previous research [13, 3, 47], we utilize a 2D UNet based on a pretrained EfficientNet-B7 [41] as the image backbone. ... Swin-T [30] is employed as the 2D backbone in the TPV-based branch.
Experiment Setup | Yes | We train CGFormer for 25 epochs on 4 NVIDIA 4090 GPUs, with a batch size of 4. It approximately consumes 19 GB of GPU memory on each GPU during the training phase. We employ the AdamW [32] optimizer with β1 = 0.9, β2 = 0.99 and set the maximum learning rate to 3 × 10⁻⁴. The cosine annealing learning rate strategy is adopted for the learning rate decay, where the cosine warmup strategy is applied for the first 5% of iterations.
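
The Dataset Splits row above only reports sequence counts (10 train / 1 val / 11 test for SemanticKITTI). The sketch below spells out the sequence IDs of the standard SemanticKITTI semantic scene completion split, which matches those counts; the specific IDs are an assumption based on the common protocol, not on the quoted excerpt.

```python
# Standard SemanticKITTI SSC split (sequence IDs follow the community convention;
# the excerpt above states only the counts: 10 train / 1 val / 11 test).
SEMANTIC_KITTI_SPLIT = {
    "train": ["00", "01", "02", "03", "04", "05", "06", "07", "09", "10"],  # 10 sequences
    "val":   ["08"],                                                         # 1 sequence
    "test":  [f"{i:02d}" for i in range(11, 22)],                            # 11 sequences
}

assert len(SEMANTIC_KITTI_SPLIT["train"]) == 10
assert len(SEMANTIC_KITTI_SPLIT["test"]) == 11
```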
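The Software Dependencies row names the pretrained 2D backbones but no library versions. A minimal sketch of how such backbones could be instantiated with torchvision follows; the official repository may build them through timm or mmdetection instead, so these calls are illustrative assumptions rather than the paper's actual setup.

```python
import torch
from torchvision import models

# Illustrative only: the paper reports a pretrained EfficientNet-B7 image backbone
# and a Swin-T backbone for the TPV-based branch; the official code may construct
# them differently (e.g. via timm), this just shows equivalent pretrained models.
effnet_b7 = models.efficientnet_b7(weights=models.EfficientNet_B7_Weights.IMAGENET1K_V1)
swin_t = models.swin_t(weights=models.Swin_T_Weights.IMAGENET1K_V1)

image = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    img_feats = effnet_b7.features(image)  # backbone features for the 2D UNet branch
    tpv_feats = swin_t.features(image)     # backbone features for the TPV-based branch
print(img_feats.shape, tpv_feats.shape)
```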
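The Experiment Setup row fully specifies the optimizer and learning-rate schedule. Below is a minimal sketch, assuming PyTorch's OneCycleLR as a stand-in for "cosine annealing with cosine warmup over the first 5% of iterations"; the released code may implement the schedule through its own training hooks, and the model here is a hypothetical placeholder.

```python
import torch

def build_optimization(model, steps_per_epoch, epochs=25):
    """Optimizer and schedule as reported: AdamW (beta1=0.9, beta2=0.99),
    max learning rate 3e-4, cosine warmup for the first 5% of iterations,
    then cosine annealing decay."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.99))
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer,
        max_lr=3e-4,
        total_steps=steps_per_epoch * epochs,
        pct_start=0.05,           # warmup over the first 5% of iterations
        anneal_strategy="cos",    # cosine warmup and cosine decay
        cycle_momentum=False,     # keep betas fixed at (0.9, 0.99) as reported
    )
    return optimizer, scheduler

# Hypothetical usage with a placeholder model (the paper trains with batch size 4
# on 4 NVIDIA 4090 GPUs); the scheduler is stepped once per training iteration.
model = torch.nn.Linear(8, 8)  # stand-in for CGFormer
optimizer, scheduler = build_optimization(model, steps_per_epoch=1000)
for _ in range(10):
    optimizer.step()
    scheduler.step()
```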