OctAttention: Octree-Based Large-Scale Contexts Model for Point Cloud Compression

Authors: Chunyang Fu, Ge Li, Rui Song, Wei Gao, Shan Liu

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared to the previous state-of-the-art works, our approach obtains a 10%-35% BD-Rate gain on the LiDAR benchmark (e.g. SemanticKITTI) and object point cloud dataset (e.g. MPEG 8i, MVUB), and saves 95% coding time compared to the voxel-based baseline. The experiments show that our method outperforms these state-of-the-art methods, which are only designed for a specific category of point clouds. We perform an ablation experiment on SemanticKITTI to demonstrate the effectiveness of a large receptive field context.
Researcher Affiliation | Collaboration | Chunyang Fu (1,2), Ge Li (1), Rui Song (1), Wei Gao (1,2)*, Shan Liu (3); (1) School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School; (2) Peng Cheng Laboratory; (3) Tencent America
Pseudocode | No | The paper describes the proposed method in prose and equations, but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/zb12138/OctAttention.
Open Datasets | Yes | LiDAR Dataset: SemanticKITTI (Behley et al. 2019) is a large sparse LiDAR dataset for self-driving. Object Point Cloud Dataset: Microsoft Voxelized Upper Bodies (MVUB) (Charles et al. 2016) is a dynamic voxelized point cloud dataset. 8i Voxelized Full Bodies (MPEG 8i) (Eugene et al. 2017) includes sequences of smooth surface and complete human shape point clouds.
Dataset Splits | No | The paper specifies training and testing splits for the SemanticKITTI dataset ('sequences 00 to 10... for training, and sequences 11 to 21... for testing'), but does not mention a separate validation split.
Hardware Specification | Yes | We implement our model in PyTorch and perform the training/testing with a Xeon E5-2637 CPU and one NVIDIA TITAN Xp GPU (12 GB memory).
Software Dependencies | No | The paper states 'We implement our model in PyTorch' but does not specify any version numbers for PyTorch or any other software dependencies used in the experiments.
Experiment Setup | Yes | We use batch sizes of 32, epochs of 8 and Adam optimizer with a learning rate of 1e-3. Occupancy, level index, and octant index are embedded into 128, 6, and 4 dimensions, respectively. We set K = 4, N = N0 = 1024 and use 2 layers and 4 heads in multi-head self-attention in experiments unless otherwise specified.
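The reported setup (embedding sizes 128/6/4, a 2-layer, 4-head self-attention encoder, Adam with lr 1e-3) can be sketched in PyTorch as below. This is a minimal illustration of that configuration, not the authors' implementation: the 138-to-144 input projection, the 255-way output head for non-empty occupancy symbols, and the vocabulary sizes for the level and octant embeddings are our own assumptions, and the K = 4 ancestor features, causal masking, and training loop are omitted.

```python
import torch
import torch.nn as nn

class OctAttentionSketch(nn.Module):
    """Hedged sketch of the reported context-model configuration:
    occupancy/level/octant embeddings of size 128/6/4 and a 2-layer,
    4-head self-attention encoder over a window of octree nodes."""
    def __init__(self, d_occ=128, d_level=6, d_octant=4,
                 n_layers=2, n_heads=4, d_model=144):
        super().__init__()
        self.occ_emb = nn.Embedding(256, d_occ)      # 8-bit occupancy codes
        self.level_emb = nn.Embedding(16, d_level)   # octree depth (assumed max 16)
        self.octant_emb = nn.Embedding(8, d_octant)  # child position 0..7
        # 128 + 6 + 4 = 138; project to 144 (our assumption) so that
        # d_model is divisible by the 4 attention heads.
        self.proj = nn.Linear(d_occ + d_level + d_octant, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 255)          # non-empty occupancy symbols

    def forward(self, occ, level, octant):
        x = torch.cat([self.occ_emb(occ), self.level_emb(level),
                       self.octant_emb(octant)], dim=-1)
        return self.head(self.encoder(self.proj(x)))

model = OctAttentionSketch()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr 1e-3 as reported

# Dummy batch: 2 sequences of 32 octree nodes (the paper uses N = 1024).
occ = torch.randint(0, 256, (2, 32))
level = torch.randint(0, 16, (2, 32))
octant = torch.randint(0, 8, (2, 32))
logits = model(occ, level, octant)
print(logits.shape)  # per-node logits over 255 occupancy symbols
```

The reported batch size of 32 and 8 training epochs would apply to the (omitted) training loop over the SemanticKITTI octree sequences.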