OpenDlign: Open-World Point Cloud Understanding with Depth-Aligned Images
Authors: Ye Mao, Junpeng Jing, Krystian Mikolajczyk
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that OpenDlign achieves high zero-shot and few-shot performance on diverse 3D tasks, despite only fine-tuning 6 million parameters on a limited ShapeNet dataset. |
| Researcher Affiliation | Academia | Ye Mao, Junpeng Jing, Krystian Mikolajczyk (Imperial College London) |
| Pseudocode | No | The paper describes methods with steps and includes one formula (Equation 1) but does not present any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | The supplementary material includes the complete code for image generation, model training, and evaluation. |
| Open Datasets | Yes | We first evaluated OpenDlign under the zero-shot shape classification task on four benchmark datasets: ModelNet40 [46], ScanObjectNN [47], OmniObject3D [48], and Objaverse-LVIS [23]. Point cloud sizes are 10,000 points for ModelNet40 and Objaverse-LVIS, 2,048 for ScanObjectNN, and 4,096 for OmniObject3D. |
| Dataset Splits | No | The paper mentions evaluating on datasets such as ModelNet40 and Objaverse-LVIS and training on ShapeNet, but it does not specify explicit training, validation, and test splits (e.g., percentages or sample counts for each split). |
| Hardware Specification | Yes | The multimodal alignment was achieved by fine-tuning 10 epochs on an A100 80GB GPU... The entire generation process for ShapeNet [21] spanned 16 days on 8 RTX 6000 GPUs. |
| Software Dependencies | No | The paper mentions being 'implemented in PyTorch' and uses 'ControlNet v1.1 [40]' and 'OpenCLIP-ViT-H-14', but it does not provide specific version numbers for PyTorch or other general software dependencies, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | The multimodal alignment was achieved by fine-tuning 10 epochs on an A100 80GB GPU, employing the AdamW optimizer and the OneCycle scheduler with a peak learning rate of 3 × 10⁻⁴ and a batch size of 128 (see the training-loop sketch after this table). |
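
A minimal PyTorch sketch of the optimizer and scheduler settings quoted in the Experiment Setup row (AdamW, OneCycle scheduler, peak learning rate 3 × 10⁻⁴, batch size 128, 10 epochs). The model, dataset, and loss below are hypothetical placeholders standing in for the paper's OpenDlign alignment pipeline, which is not reproduced here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder module and synthetic data; the actual OpenDlign model,
# depth-aligned images, and alignment objective are not shown here.
model = torch.nn.Linear(512, 512)
dataset = TensorDataset(torch.randn(1024, 512), torch.randn(1024, 512))
loader = DataLoader(dataset, batch_size=128, shuffle=True)  # batch size 128 as reported

epochs = 10  # 10 fine-tuning epochs as reported
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=3e-4, epochs=epochs, steps_per_epoch=len(loader)
)

for epoch in range(epochs):
    for x, target in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), target)  # placeholder loss
        loss.backward()
        optimizer.step()
        scheduler.step()  # OneCycle is stepped once per batch
```

The sketch only illustrates the reported hyperparameters; swapping in the paper's released code (included in the supplementary material) would replace the placeholder model, data, and loss.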