ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale LiDAR Point Clouds

Authors: Xiangze Jia, Hui Zhou, Xinge Zhu, Yandong Guo, Ji Zhang, Yuexin Ma

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show the effectiveness and superiority of our ContrastMotion on both scene flow and motion prediction tasks.
Researcher Affiliation | Collaboration | Xiangze Jia (1), Hui Zhou (2), Xinge Zhu (2), Yandong Guo (3), Ji Zhang (1,4), Yuexin Ma (5); (1) Nanjing University of Aeronautics and Astronautics, (2) The Chinese University of Hong Kong, (3) OPPO Research Institute, (4) Zhejiang Lab, (5) ShanghaiTech University
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | Yes | https://github.com/JXZxiaowu/ContrastMotion
Open Datasets | Yes | 4.1 Datasets: KITTI Scene Flow [Menze et al., 2015; Menze et al., 2018] consists of 200 training scenes and 200 test scenes. ... nuScenes [Caesar et al., 2019] is a large-scale public dataset for autonomous driving collected from 1000 real scenes...
Dataset Splits | Yes | Following Pillar Motion [Luo et al., 2021], we use the same 500 training scenes, 100 validation scenes, and 250 testing scenes.
Hardware Specification | No | The paper mentions 'a 32G GPU' but does not specify the GPU model (e.g., NVIDIA A100, RTX 3090), CPU details, or other hardware components.
Software Dependencies | No | The paper mentions 'PyTorch [Paszke et al., 2019]' but does not provide a specific version number for PyTorch or any other software dependency.
Experiment Setup | Yes | For the FT3D and nuScenes datasets, we train ContrastMotion for 300 and 100 epochs respectively, with the initial learning rate set to 0.001 and weight decay to 0.001. The batch size is set to the maximum available on a 32G GPU, and Adam is used as the optimizer. We crop the point clouds following the baseline models [Li et al., 2022b; Luo et al., 2021], and the pillar size is set to [0.25, 0.25]... The distance threshold ε in V(P_t^i) is set to 1.1 times the pillar size... The dimension of the output feature map D = 32. In pillar association, the patch size s = 32 and α = 2.
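The quoted setup can be collected into a short configuration sketch. Only the numeric hyperparameter values come from the paper's reported setup; the dictionary layout and the names `config` and `PILLAR_SIZE` are illustrative assumptions, not the authors' code.

```python
# Hyperparameters quoted from the paper's experiment setup.
# The batch size is described only as "the maximum available on a 32G GPU",
# so it is left unspecified here.
PILLAR_SIZE = (0.25, 0.25)  # pillar size in x/y

config = {
    "epochs": {"FT3D": 300, "nuScenes": 100},
    "optimizer": "Adam",
    "lr": 1e-3,                   # initial learning rate
    "weight_decay": 1e-3,
    "pillar_size": PILLAR_SIZE,
    "eps": 1.1 * PILLAR_SIZE[0],  # distance threshold ε in V(P_t^i)
    "feature_dim": 32,            # output feature map dimension D
    "patch_size": 32,             # patch size s in pillar association
    "alpha": 2,                   # α in pillar association
}
```

Such a dictionary would typically be passed to the dataset-specific training loop, with the epoch count selected per dataset (e.g., `config["epochs"]["nuScenes"]`).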