Prototypical Variational Autoencoder for 3D Few-shot Object Detection

Authors: Weiliang Tang, Biqi Yang, Xianzhi Li, Yun-Hui Liu, Pheng-Ann Heng, Chi-Wing Fu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show the top performance of our approach over the state of the art on two FS3D benchmarks. Quantitative ablations and qualitative prototype analysis further demonstrate that our probabilistic modeling can significantly boost prototype learning for FS3D. We conduct extensive experiments on two FS3D benchmarks, FS-ScanNet and FS-SUNRGBD [10]. The results in Sec. 4.2 show that our method significantly outperforms the SOTA approaches in various few-shot settings. Further analysis of prototypes in Sec. 4.3 demonstrates the effectiveness of leveraging a VAE for prototypical learning. (A hedged sketch of this probabilistic prototype idea follows the table.)
Researcher Affiliation | Academia | Weiliang Tang* (The Chinese University of Hong Kong, wltang21@cse.cuhk.edu.hk); Biqi Yang* (The Chinese University of Hong Kong, bqyang@cse.cuhk.edu.hk); Xianzhi Li (Huazhong University of Science and Technology, xzli@hust.edu.cn); Pheng-Ann Heng (The Chinese University of Hong Kong, pheng@cse.cuhk.edu.hk); Yun-Hui Liu (The Chinese University of Hong Kong, yhliu@mae.cuhk.edu.hk); Chi-Wing Fu (Department of CSE and SHIAE, The Chinese University of Hong Kong, cwfu@cse.cuhk.edu.hk)
Pseudocode | No | The paper describes the methods using prose and mathematical equations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We conduct experiments on two FS3D benchmarks, FS-ScanNet and FS-SUNRGBD [10]. The FS-ScanNet dataset consists of 18 object categories and 1,513 point cloud scenes in total. The FS-SUNRGBD dataset contains 5,000 point cloud scenes, covering 10 object categories. Reference [10] is provided for these datasets: Shizhen Zhao and Xiaojuan Qi, "Prototypical VoteNet for Few-shot 3D Point Cloud Object Detection," arXiv preprint arXiv:2210.05593, 2022.
Dataset Splits | No | The paper describes a K-shot setting for novel classes and base/novel splits for both datasets, which indicates how the data is used for training and evaluation. However, it does not provide train/validation/test percentages or exact sample counts, nor does it describe a dedicated validation split in the detail required for full reproducibility. (A toy K-shot split construction is sketched after this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments.
Software Dependencies | No | The paper mentions architectural components such as PointNet++ and FoldingNet but does not specify any software dependencies with version numbers (e.g., specific Python, PyTorch, or CUDA versions) required for reproduction.
Experiment Setup | No | The paper states: "More details on the splits of both datasets, the network architecture, and the training scheme can be found in the Appendix." This indicates that specific experimental setup details, such as hyperparameters and training configurations, are not provided in the main text.
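
The Research Type row quotes the paper's claim that probabilistic prototype modeling boosts FS3D. Below is a minimal sketch of that general idea: encoding a class feature into a Gaussian and sampling a prototype with the standard VAE reparameterization trick. All names here (PrototypeVAE, feat_dim, latent_dim) are hypothetical illustrations; the paper's actual architecture (e.g., its PointNet++ features and FoldingNet-style decoding) is not reproduced here.

```python
# Hypothetical sketch of VAE-style probabilistic prototype modeling;
# not the paper's implementation, which is not publicly released.
import torch
import torch.nn as nn

class PrototypeVAE(nn.Module):
    def __init__(self, feat_dim=256, latent_dim=128):
        super().__init__()
        # Encode a per-class feature into a Gaussian over prototypes.
        self.to_mu = nn.Linear(feat_dim, latent_dim)
        self.to_logvar = nn.Linear(feat_dim, latent_dim)
        self.decode = nn.Linear(latent_dim, feat_dim)  # reconstruction head

    def forward(self, class_feat):
        mu = self.to_mu(class_feat)
        logvar = self.to_logvar(class_feat)
        # Reparameterization trick: sample a prototype from N(mu, sigma^2)
        # while keeping the sampling step differentiable.
        std = torch.exp(0.5 * logvar)
        prototype = mu + std * torch.randn_like(std)
        recon = self.decode(prototype)
        # KL term regularizes the prototype distribution toward N(0, I).
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return prototype, recon, kl
```

The reparameterization trick is what makes sampled prototypes trainable end to end, which is the generic mechanism behind the "probabilistic modeling" the quoted ablations evaluate.

The Dataset Splits row refers to a K-shot setting for novel classes. As a point of reference, a K-shot detection split typically keeps all annotations for base classes but only K annotated instances per novel class. The sketch below illustrates that construction; the record layout (scene_id, class_name) and function name are assumptions, not the benchmark's actual format.

```python
# Toy illustration of a K-shot split; field names are hypothetical.
import random
from collections import defaultdict

def build_k_shot_split(annotations, novel_classes, k, seed=0):
    """annotations: list of (scene_id, class_name) box records."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in annotations:
        by_class[record[1]].append(record)
    kept = []
    for cls, records in by_class.items():
        if cls in novel_classes:
            rng.shuffle(records)
            kept.extend(records[:k])  # only K shots per novel class
        else:
            kept.extend(records)      # all annotations for base classes
    return kept
```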
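Note that such a sketch only fixes the sampling rule; the reproducibility gap flagged above is that the paper does not report the exact per-class counts, random seeds, or validation protocol needed to regenerate its specific splits.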