LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling
Authors: Yaohua Zha, Naiqi Li, Yanzi Wang, Tao Dai, Hang Guo, Bin Chen, Zhi Wang, Zhihao Ouyang, Shu-Tao Xia
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results show that our compact model significantly surpasses existing Transformer-based models in both performance and efficiency; in particular, our LCM-based Point-MAE model, compared to the Transformer-based model, achieved improvements of 1.84%, 0.67%, and 0.60% in average accuracy on the three variants of ScanObjectNN while reducing parameters by 88% and computation by 73%. |
| Researcher Affiliation | Collaboration | 1Tsinghua Shenzhen International Graduate School, Tsinghua University 2Institute of Visual Intelligence, Pengcheng Laboratory 3College of Computer Science and Software Engineering, Shenzhen University 4Harbin Institute of Technology, Shenzhen 5Bytedance Inc. |
| Pseudocode | No | The paper describes methods through figures and textual explanations but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/zyh16143998882/LCM. |
| Open Datasets | Yes | We pre-train our LCM using five different pre-training strategies: Point-BERT [60], MaskPoint [28], Point-MAE [37], Point-M2AE [65], and ACT [8]. For a fair comparison, we use ShapeNet [3] as our pre-training dataset, encompassing over 50,000 distinct 3D models spanning 55 prevalent object categories. We initially assess the overall classification accuracy of our pre-trained models on both real-scanned (ScanObjectNN [44]) and synthetic (ModelNet40 [57]) datasets. We further assess the object detection performance of our pre-trained model on the more challenging scene-level point cloud dataset, ScanNetV2 [6]. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide the specific train/validation/test dataset splits (percentages or counts) or reference predefined splits with sufficient detail for reproduction in the main text. |
| Hardware Specification | Yes | Because our LCM model is surprisingly lightweight and efficient, we were able to complete the pre-training tasks using just a single 24GB NVIDIA GeForce RTX 3090 GPU. For downstream classification and segmentation tasks, we used a single RTX 3090 GPU for each. For detection tasks, to accelerate training, we utilized four parallel RTX 3090 GPUs. |
| Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions) required for replication. |
| Experiment Setup | No | The paper states that hyperparameter settings were adopted from previous methods (e.g., "we used the same settings as previous methods") and mentions some data augmentation and baseline replacement, but it does not explicitly list specific hyperparameter values (e.g., learning rate, batch size, epochs) or detailed training schedules in the main text. |