Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders

Authors: Yaohua Zha, Huizhen Ji, Jinmin Li, Rongsheng Li, Tao Dai, Bin Chen, Zhi Wang, Shu-Tao Xia

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method significantly improves the pre-training efficiency compared to cross-modal alternatives, and extensive downstream experiments underscore the state-of-the-art effectiveness, particularly outperforming our baseline (Point-MAE) by 5.16%, 5.00%, and 5.04% in three variants of ScanObjectNN, respectively.
Researcher Affiliation | Academia | (1) Tsinghua Shenzhen International Graduate School, Tsinghua University; (2) Research Center of Artificial Intelligence, Peng Cheng Laboratory; (3) College of Computer Science and Software Engineering, Shenzhen University; (4) Harbin Institute of Technology, Shenzhen
Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks; it provides architectural diagrams but no formal algorithm descriptions.
Open Source Code | Yes | Code is available at https://github.com/zyh16143998882/AAAI24-PointFEMAE.
Open Datasets | Yes | We use ShapeNet (Chang et al. 2015) as our pre-training dataset... real-world (ScanObjectNN (Uy et al. 2019)) and synthetic (ModelNet40 (Wu et al. 2015)) datasets.
Dataset Splits | No | The paper uses established datasets (ShapeNet, ScanObjectNN, ModelNet40) and states that for few-shot learning on ModelNet40 "we use the above-mentioned n × m samples for training, while 20 unseen samples from each category for testing" (see the episode sketch below the table). However, it does not give explicit train/validation/test split percentages or counts for the main tasks, often deferring to the splits used in prior work without detailing them.
Hardware Specification | No | The paper does not report the hardware (e.g., exact GPU/CPU models or memory sizes) used to run its experiments.
Software Dependencies | No | The paper does not list ancillary software dependencies, such as library names with version numbers, needed to replicate the experiments.
Experiment Setup | No | The paper details the data processing (e.g., sampling 1024 or 2048 points and dividing them into 64 patches of 32 points; see the grouping sketch below the table) and the architectural components, but it does not list concrete hyperparameter values such as learning rate, batch size, number of epochs, or optimizer settings.
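The patching pipeline flagged in the Experiment Setup row (64 patches of 32 points drawn from a 1024- or 2048-point cloud) matches the farthest-point-sampling plus k-nearest-neighbor grouping that Point-MAE-style models conventionally use. Below is a minimal plain-PyTorch sketch written under that assumption; it is an illustration of the standard FPS+KNN scheme, not the authors' implementation, which lives in the repository linked above and may rely on optimized CUDA ops.

```python
import torch

def fps(points, n_centers):
    """points: (N, 3). Greedy farthest point sampling; returns n_centers indices."""
    n = points.shape[0]
    idx = torch.zeros(n_centers, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    farthest = int(torch.randint(n, (1,)))        # random starting point
    for i in range(n_centers):
        idx[i] = farthest
        d = ((points - points[farthest]) ** 2).sum(dim=-1)
        dist = torch.minimum(dist, d)             # distance to nearest chosen center
        farthest = int(torch.argmax(dist))        # next center: farthest remaining point
    return idx

def group_patches(points, n_patches=64, patch_size=32):
    """points: (N, 3) -> centers (n_patches, 3), patches (n_patches, patch_size, 3)."""
    centers = points[fps(points, n_patches)]
    knn = torch.cdist(centers, points).topk(patch_size, largest=False).indices
    patches = points[knn] - centers[:, None]      # normalize each patch to its center
    return centers, patches

# Example: a random 1024-point cloud -> 64 local patches of 32 points each.
centers, patches = group_patches(torch.randn(1024, 3))
```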
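Likewise, the few-shot protocol quoted in the Dataset Splits row (train on n × m samples, test on 20 unseen samples per category) can be made concrete. The sketch below builds one n-way, m-shot episode; the function name `sample_few_shot_episode` and the `samples` container are hypothetical, and only the n × m support / 20-per-class query structure is taken from the paper.

```python
import random
from collections import defaultdict

def sample_few_shot_episode(samples, n_way, m_shot, n_query=20, seed=0):
    """samples: iterable of (point_cloud, class_label) pairs over all dataset classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append((item, label))
    train, test = [], []
    for c in rng.sample(sorted(by_class), n_way):          # pick n classes
        pool = rng.sample(by_class[c], m_shot + n_query)   # disjoint support/query
        train += pool[:m_shot]                             # m support samples per class
        test += pool[m_shot:]                              # 20 unseen query samples per class
    return train, test

# e.g. a 5-way 10-shot episode (a common setting in this benchmark family):
# support, query = sample_few_shot_episode(modelnet40_train, n_way=5, m_shot=10)
```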