PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds

Authors: Hao Yang, Haiyang Wang, Di Dai, Liwei Wang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate PRED's superiority over prior point cloud pre-training methods, providing significant improvements on various large-scale datasets for 3D perception tasks.
Researcher Affiliation | Academia | Hao Yang (1,3), Haiyang Wang (1), Di Dai (2), Liwei Wang (1,2). 1: Center for Data Science, Peking University; 2: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University; 3: Pazhou Lab. {haoy@stu, wanghaiyang@stu, didai@stu, wanglw@cis}.pku.edu.cn
Pseudocode | No | The paper describes the methodological steps and provides mathematical equations but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | Codes will be available at https://github.com/PRED4pc/PRED.
Open Datasets | Yes | Dataset: nuScenes (4) is a challenging outdoor dataset providing diverse annotations for various tasks... Dataset: ONCE (26) is a large-scale autonomous driving dataset...
Dataset Splits | Yes | We present the performance on the validation and test sets of nuScenes in Table 1. The dataset is split into training, validation, and testing sets consisting of 5k, 3k, and 8k point clouds, respectively.
Hardware Specification | Yes | All experiments are conducted on NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions using "DeepLabV3 (7) with MobileNets (18) as the backbone" and the "AdamW (25) optimizer" but does not specify version numbers for these software components or for general programming environments like Python, PyTorch, or CUDA.
Experiment Setup | Yes | During pre-training, we train the model using the AdamW (25) optimizer and the one-cycle policy, with a maximum learning rate of 3e-4. We pre-train the model for 45 epochs on the nuScenes dataset, 20 epochs on the ONCE small, 5 epochs on the ONCE medium, and 3 epochs on the ONCE large. During fine-tuning, we employ random flipping, scaling, rotation, and copy-n-paste as data augmentations, with a maximum learning rate of 3e-3.
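As a concrete illustration of the quoted optimizer and schedule settings, the sketch below shows one way to configure AdamW with a one-cycle learning-rate policy at a maximum learning rate of 3e-4 in PyTorch. The stand-in model, weight decay, and steps-per-epoch are illustrative placeholders, not values reported in the paper.

```python
# Minimal sketch of the reported pre-training optimizer setup:
# AdamW + one-cycle schedule with max learning rate 3e-4.
# The backbone stand-in, weight decay, and steps_per_epoch are
# placeholders for illustration only.
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR

model = nn.Linear(128, 10)      # stand-in for the 3D backbone
epochs = 45                     # nuScenes pre-training length quoted above
steps_per_epoch = 1000          # placeholder; depends on dataset and batch size

optimizer = AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)
scheduler = OneCycleLR(
    optimizer,
    max_lr=3e-4,                # maximum learning rate quoted for pre-training
    epochs=epochs,
    steps_per_epoch=steps_per_epoch,
)

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        optimizer.zero_grad()
        loss = model(torch.randn(4, 128)).sum()  # dummy forward pass and loss
        loss.backward()
        optimizer.step()
        scheduler.step()        # one-cycle policy advances per optimizer step
```

For fine-tuning, the same structure would apply with the quoted maximum learning rate of 3e-3 and the listed data augmentations added to the dataloader pipeline.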