3D Object Proposals for Accurate Object Class Detection
Authors: Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Andrew G. Berneshawi, Huimin Ma, Sanja Fidler, Raquel Urtasun
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show significant performance gains over existing RGB and RGB-D object proposal methods on the challenging KITTI benchmark. Combined with convolutional neural net (CNN) scoring, our approach outperforms all existing results on all three KITTI object classes. |
| Researcher Affiliation | Academia | ¹Department of Electronic Engineering, Tsinghua University; ²Department of Computer Science, University of Toronto |
| Pseudocode | No | The paper describes the inference and learning processes but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and data are online: http://www.cs.toronto.edu/ 3dop. |
| Open Datasets | Yes | We evaluate our approach on the challenging KITTI autonomous driving dataset [11], which contains three object classes: Car, Pedestrian, and Cyclist. |
| Dataset Splits | Yes | Since the test ground-truth labels are not available, we split the KITTI training set into train and validation sets (each containing half of the images). We ensure that our training and validation set do not come from the same video sequences, and evaluate the performance of our bounding box proposals on the validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions components such as structured SVM, Fast R-CNN, OxfordNet, and ImageNet but does not provide specific version numbers for them. |
| Experiment Setup | Yes | We extend this basic network by adding a context branch after the last convolutional layer, and an orientation regression loss to jointly learn object location and orientation. ... The context regions are obtained by enlarging the candidate boxes by a factor of 1.5. We used smooth L1 loss [34] for orientation regression. We use OxfordNet [3] trained on ImageNet to initialize the weights of convolutional layers and the branch for candidate boxes. The parameters of the context branch are initialized by copying the weights from the original branch. We then fine-tune it end-to-end on the KITTI training set. |
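
The two concrete operations named in the Experiment Setup row, enlarging a candidate box by a factor of 1.5 to get its context region and applying a smooth L1 loss, can be sketched roughly as below. This is a minimal illustration, not the authors' code. The function names `enlarge_box` and `smooth_l1` are hypothetical. Three things are assumptions the quoted text does not state: that the box is scaled about its center with the factor applied to each side length, that it is optionally clipped to the image, and that the loss takes the Fast R-CNN form with a threshold of 1.

```python
def enlarge_box(box, factor=1.5, image_size=None):
    """Scale an (x1, y1, x2, y2) box about its center by `factor`.

    Assumption: the factor applies to width and height, not to area.
    If `image_size` = (width, height) is given, the result is clipped to it.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * factor / 2.0
    half_h = (y2 - y1) * factor / 2.0
    nx1, ny1, nx2, ny2 = cx - half_w, cy - half_h, cx + half_w, cy + half_h
    if image_size is not None:
        width, height = image_size
        nx1, ny1 = max(0.0, nx1), max(0.0, ny1)
        nx2, ny2 = min(float(width), nx2), min(float(height), ny2)
    return (nx1, ny1, nx2, ny2)


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss for one scalar: quadratic near zero, linear beyond `beta`.

    Assumption: beta = 1.0, as in the Fast R-CNN formulation.
    """
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


if __name__ == "__main__":
    print(enlarge_box((10, 10, 30, 20)))                   # context region
    print(enlarge_box((0, 0, 20, 20), image_size=(25, 25)))  # clipped variant
    print(smooth_l1(0.5, 0.0), smooth_l1(2.0, 0.0))
```

The sketch treats the orientation target as a plain scalar. How the paper parameterizes orientation, and whether it handles angle wrap-around, is not covered by the quoted setup.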