OpenDlign: Open-World Point Cloud Understanding with Depth-Aligned Images

Authors: Ye Mao, Junpeng Jing, Krystian Mikolajczyk

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our experiments show that OpenDlign achieves high zero-shot and few-shot performance on diverse 3D tasks, despite only fine-tuning 6 million parameters on a limited ShapeNet dataset. |
| Researcher Affiliation | Academia | Ye Mao, Junpeng Jing, Krystian Mikolajczyk (Imperial College London) |
| Pseudocode | No | The paper describes its methods in steps and includes one formula (Equation 1), but does not present any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | The supplementary material includes the complete code for image generation, model training, and evaluation. |
| Open Datasets | Yes | We first evaluated OpenDlign under the zero-shot shape classification task on four benchmark datasets: ModelNet40 [46], ScanObjectNN [47], OmniObject3D [48], and Objaverse-LVIS [23]. Point-cloud sizes are 10,000 points for ModelNet40 and Objaverse-LVIS, 2,048 for ScanObjectNN, and 4,096 for OmniObject3D. |
| Dataset Splits | No | The paper mentions training on ShapeNet and using datasets such as ModelNet40 and Objaverse-LVIS, but it does not specify explicit training, validation, and test splits (e.g., percentages or sample counts for each split). |
| Hardware Specification | Yes | The multimodal alignment was achieved by fine-tuning 10 epochs on an A100-80GB GPU... The entire generation process for ShapeNet [21] spanned 16 days on 8 RTX 6000 GPUs. |
| Software Dependencies | No | The paper mentions being "implemented in PyTorch" and using "ControlNet v1.1 [40]" and "OpenCLIP-ViT-H-14", but does not provide specific version numbers for PyTorch or other general software dependencies, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | The multimodal alignment was achieved by fine-tuning 10 epochs on an A100-80GB GPU, employing the AdamW optimizer and the OneCycle scheduler with a peak learning rate of 3 × 10⁻⁴ and a batch size of 128. |
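
The Experiment Setup row maps directly onto standard PyTorch APIs. Below is a minimal sketch of that fine-tuning recipe, assuming hypothetical stand-ins for the model and dataloader (the paper's supplementary material contains the actual training code); only the optimizer, scheduler, and hyperparameter choices are taken from the paper.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR

# Hypothetical stand-ins: the paper's supplementary code defines the real
# OpenDlign model (~6M trainable parameters) and the depth-aligned image
# dataloader. Only the hyperparameters below are taken from the paper.
model = torch.nn.Linear(512, 512)  # placeholder module, not the real model
train_loader = [(torch.randn(128, 512), torch.randn(128, 512))] * 100  # dummy batches of 128

EPOCHS = 10      # "fine-tuning 10 epochs on an A100-80GB GPU"
PEAK_LR = 3e-4   # peak learning rate reported in the paper

optimizer = AdamW(model.parameters(), lr=PEAK_LR)
scheduler = OneCycleLR(optimizer, max_lr=PEAK_LR,
                       epochs=EPOCHS, steps_per_epoch=len(train_loader))

for epoch in range(EPOCHS):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(inputs), targets)  # placeholder loss
        loss.backward()
        optimizer.step()
        scheduler.step()  # OneCycle advances once per optimizer step
```

Note that OneCycleLR is stepped once per batch, which is the conventional usage and matches the steps_per_epoch argument above.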
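
Likewise, the per-benchmark point-cloud sizes quoted under Open Datasets can be captured in a small preprocessing helper. This is a hypothetical sketch: the point counts come from the paper, but the sampling strategy does not, so uniform random subsampling is assumed.

```python
import numpy as np

# Per-benchmark point-cloud sizes as quoted in the paper.
NUM_POINTS = {
    "ModelNet40": 10_000,
    "Objaverse-LVIS": 10_000,
    "ScanObjectNN": 2_048,
    "OmniObject3D": 4_096,
}

def subsample(points: np.ndarray, dataset: str) -> np.ndarray:
    """Subsample an (N, 3) point cloud to the benchmark's stated size.

    Hypothetical helper: the paper gives the point counts but not the
    sampling strategy, so uniform random sampling is assumed here.
    """
    n = NUM_POINTS[dataset]
    idx = np.random.choice(len(points), size=n, replace=len(points) < n)
    return points[idx]

# Example: reduce a raw scan to the 2,048 points used for ScanObjectNN.
cloud = subsample(np.random.rand(50_000, 3), "ScanObjectNN")
```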