OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding
Authors: Minghua Liu, Ruoxi Shi, Kaiming Kuang, Yinhao Zhu, Xuanlin Li, Shizhong Han, Hong Cai, Fatih Porikli, Hao Su
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate OpenShape on zero-shot 3D classification benchmarks and demonstrate its superior capabilities for open-world recognition. Specifically, OpenShape achieves a zero-shot accuracy of 46.8% on the 1,156-category Objaverse-LVIS benchmark, compared to less than 10% for existing methods. OpenShape also achieves an accuracy of 85.3% on ModelNet40, outperforming previous zero-shot baseline methods by 20% and performing on par with some fully-supervised methods. (A sketch of this zero-shot protocol follows the table.) |
| Researcher Affiliation | Collaboration | Minghua Liu¹, Ruoxi Shi², Kaiming Kuang¹, Yinhao Zhu³, Xuanlin Li¹, Shizhong Han³, Hong Cai³, Fatih Porikli³, Hao Su¹ (¹UC San Diego, ²Shanghai Jiao Tong University, ³Qualcomm AI Research) |
| Pseudocode | No | The paper does not include any section or figure explicitly labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Project Website: https://colin97.github.io/OpenShape/ |
| Open Datasets | Yes | we ensemble four currently-largest public 3D datasets for training as shown in Figure 2 (a), resulting in 876k training shapes. Among these four datasets, ShapeNetCore [8], 3D-FUTURE [16], and ABO [11] are three popular datasets used by prior works. They contain human-verified high-quality 3D shapes, but only cover a limited number of shapes and dozens of categories. The Objaverse [12] dataset is a more recent dataset... |
| Dataset Splits | No | The paper describes the datasets used for training (ensembling ShapeNetCore, 3D-FUTURE, ABO, and Objaverse) and the test splits of the evaluation benchmarks (ModelNet40 and ScanObjectNN), but it does not specify explicit train/validation splits or a cross-validation setup for its own model training and hyperparameter tuning. |
| Hardware Specification | Yes | We train the model on a single A100 GPU with a batch size of 200. |
| Software Dependencies | No | The paper mentions using 'OpenCLIP ViT-bigG-14', 'BLIP', 'Azure cognition services', and 'GPT-4', but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We train the model on a single A100 GPU with a batch size of 200. ... For the 32.3M version of Point-BERT, we utilize a learning rate of 5e-4; for the 72.1M version of Point-BERT, we utilize a learning rate of 4e-4; and for other models, we utilize a learning rate of 1e-3. For hard-negative mining, the number of seed shapes s is set to 40, the number of neighbors m is set to 5 per shape, and the threshold δ is set to 0.1. (A sketch of this batch construction follows below.) |
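
For readers reproducing the evaluation, the zero-shot protocol quoted in the Research Type row reduces to comparing normalized 3D shape embeddings against CLIP text embeddings of the candidate category names. The sketch below is a minimal illustration, not the authors' released code: `shape_encoder`, `clip_text_encoder`, and the prompt template are hypothetical stand-ins for the trained OpenShape point-cloud encoder and the OpenCLIP ViT-bigG-14 text tower.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_classify(point_clouds, category_names, shape_encoder, clip_text_encoder):
    """Predict a category index for each point cloud via text-shape similarity."""
    # Embed each candidate category name; this prompt template is an assumption.
    prompts = [f"a 3D model of a {name}" for name in category_names]
    text_emb = F.normalize(clip_text_encoder(prompts), dim=-1)    # (C, D)
    # Embed the point clouds with the trained 3D encoder.
    shape_emb = F.normalize(shape_encoder(point_clouds), dim=-1)  # (B, D)
    # Cosine similarity against every category; the argmax is the prediction.
    logits = shape_emb @ text_emb.T                               # (B, C)
    return logits.argmax(dim=-1)
```

On ModelNet40, for instance, accuracy would be the fraction of test point clouds whose argmax matches the ground-truth label over the 40 category names.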
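
The hard-negative mining hyperparameters quoted in the Experiment Setup row (s = 40 seed shapes, m = 5 shapes per seed, δ = 0.1) suggest batches of s × m = 200 shapes, matching the reported batch size. The following is a hedged sketch under that assumption: it presumes a precomputed kNN index over shape embeddings and treats δ as a distance threshold for filtering near-duplicate neighbors; the paper's exact selection rule may differ.

```python
import numpy as np

def mine_hard_negative_batch(embeddings, knn_indices, s=40, m=5, delta=0.1, rng=None):
    """Build one batch of shape indices; s * m = 200 matches the reported batch size."""
    rng = rng if rng is not None else np.random.default_rng()
    seeds = rng.choice(len(embeddings), size=s, replace=False)
    batch = []
    for seed in seeds:
        group = [int(seed)]                   # assumption: each seed contributes itself
        for nbr in knn_indices[seed]:         # neighbors assumed sorted by distance
            if len(group) == m:
                break
            # Skip near-duplicates below delta: these are likely false negatives
            # for the contrastive loss rather than informative hard negatives.
            if np.linalg.norm(embeddings[seed] - embeddings[nbr]) < delta:
                continue
            group.append(int(nbr))
        batch.extend(group)
    return np.asarray(batch)
```

Grouping each seed with its nearest non-duplicate neighbors packs hard negatives into every contrastive batch; the δ filter is what prevents near-identical shapes from being pushed apart as false negatives.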