PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points
Authors: Siyuan Huang, Yixin Chen, Tao Yuan, Siyuan Qi, Yixin Zhu, Song-Chun Zhu
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on SUN RGB-D dataset show that the proposed method significantly outperforms existing RGB-based approaches for 3D object detection. |
| Researcher Affiliation | Academia | Siyuan Huang (Department of Statistics, huangsiyuan@ucla.edu); Yixin Chen (Department of Statistics, ethanchen@ucla.edu); Tao Yuan (Department of Statistics, taoyuan@ucla.edu); Siyuan Qi (Department of Computer Science, syqi@cs.ucla.edu); Yixin Zhu (Department of Statistics, yixin.zhu@ucla.edu); Song-Chun Zhu (Department of Statistics, sczhu@stat.ucla.edu) |
| Pseudocode | No | The paper describes algorithms and formulations in text and mathematical equations but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: 'We implement our framework based on the code of Massa and Girshick [83].' This shows that the authors built on an existing framework, but the paper neither states that the PerspectiveNet implementation itself is open source nor links to a repository for it. |
| Open Datasets | Yes | We conduct comprehensive experiments on SUN RGB-D [46] dataset. |
| Dataset Splits | No | The paper mentions 4783 training images and 4220 test images but does not explicitly describe a validation dataset split or its size. |
| Hardware Specification | Yes | We use SGD for optimization with a batch size of 32 on a desktop with 4 Nvidia TITAN RTX cards (8 images each card). |
| Software Dependencies | No | The paper states, 'We implement our framework based on the code of Massa and Girshick [83].' While this implies the use of PyTorch (as indicated by the reference title 'maskrcnn-benchmark...in PyTorch'), it does not specify exact version numbers for PyTorch or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | We resize the images so that the shorter edges are all 800 pixels. To avoid over-fitting, a data augmentation procedure is performed by randomly flipping the images or randomly shifting the 2D bounding boxes with corresponding labels during the training. We use SGD for optimization with a batch size of 32 [...] The learning rate starts at 0.01 and decays by 0.1 at 30,000 and 35,000 iterations. |
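
The experiment setup quoted in the table can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `learning_rate` and `resize_shorter_edge` are hypothetical, "decays by 0.1" is read as multiplying the rate by 0.1 at each milestone, the decay is assumed to take effect at the milestone iteration itself, and the 640×480 input size in the usage note is an arbitrary example.

```python
# Values quoted from the paper's experiment setup.
BASE_LR = 0.01
MILESTONES = (30_000, 35_000)  # iterations at which the rate decays
GAMMA = 0.1                    # assumed multiplicative decay factor
BATCH_SIZE = 32
NUM_GPUS = 4
SHORTER_EDGE = 800


def learning_rate(iteration, base_lr=BASE_LR, milestones=MILESTONES, gamma=GAMMA):
    """Step schedule: scale base_lr by gamma once per milestone already reached.

    Assumption: the decay applies from the milestone iteration onward (>=).
    """
    reached = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** reached


def resize_shorter_edge(width, height, target=SHORTER_EDGE):
    """Return (width, height) scaled so the shorter edge equals `target` pixels.

    The aspect ratio is preserved; the longer edge is rounded to the nearest pixel.
    """
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


# Images per card, matching the "8 images each card" statement.
IMAGES_PER_GPU = BATCH_SIZE // NUM_GPUS
```

For example, `learning_rate(0)` gives the initial 0.01, the value drops to roughly 0.001 from iteration 30,000 and to roughly 0.0001 from iteration 35,000, and `resize_shorter_edge(640, 480)` scales a landscape image so that its height becomes 800 pixels.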