Geometry-Constrained Car Recognition Using a 3D Perspective Network

Authors: Zeng Rui, Ge Zongyuan, Denman Simon, Sridharan Sridha, Fookes Clinton
Pages: 1161-1168

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present qualitative and quantitative results on the vehicle classification and verification tasks in the BoxCars dataset. The results demonstrate that, by learning such a concise 3D representation, we can achieve superior performance to methods that only use 2D information while retaining meaningful 3D information without the challenge of requiring a 3D CAD model.
Researcher Affiliation | Academia | (1) Queensland University of Technology; (2) Monash University. r3.zeng@hdr.qut.edu.au; zongyuan.ge@monash.edu; {s.denman, s.sridharan, c.fookes}@qut.edu.au
Pseudocode | No | The paper describes algorithms and networks (e.g., Global Network, 3D Perspective Network, Feature Fusion Network) and includes mathematical formulations for loss functions, but it does not contain any structured pseudocode blocks or sections explicitly labeled 'Algorithm'.
Open Source Code | No | The paper does not contain any explicit statement about making the source code available, nor does it provide a link to a code repository for the methodology described.
Open Datasets | Yes | To the best of our knowledge, the BoxCars dataset (Sochor, Herout, and Havel 2016) is the only dataset which provides both 3D and 2D bounding box annotations for vehicle recognition in the computer vision community. Therefore, we use it to evaluate our model.
Dataset Splits | Yes | Regarding the classification task, the dataset is split into two subsets: Medium and Hard. The Hard protocol has 87 categories and contains 37,689 training images and 18,939 testing images. The Medium protocol is composed of 77 categories and has 40,152 and 19,590 images for training and testing respectively.
Hardware Specification | Yes | Each batch takes approximately 2s on an NVIDIA Tesla P100 GPU and in total the model takes about 12 hours to converge.
Software Dependencies | No | The paper mentions several software components and frameworks such as 'RetinaNet', 'MobileNetV2', 'ResNet101', 'SGD', and 'multi-modal compact bilinear (MCB) pooling', but it does not specify any version numbers for these or other software dependencies.
Experiment Setup | Yes | SGD is chosen as our optimizer and its momentum is set to 0.9. The initial learning rate is 0.02, and is divided by 10 after every 15 epochs. The batch size is set to 30. Model optimisation ceases when training reaches 45 epochs.
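
The learning-rate schedule in the Experiment Setup row can be sketched in plain Python. This is only an illustration of the stated numbers (initial rate 0.02, divided by 10 every 15 epochs, 45 epochs in total). The function and constant names are hypothetical, and the 0-indexed epoch convention, with drops taking effect at epochs 15 and 30, is an assumption, since the paper does not say exactly when each drop is applied or which framework was used.

```python
# Hyperparameters as reported in the Experiment Setup row.
BASE_LR = 0.02        # initial learning rate
MOMENTUM = 0.9        # SGD momentum (listed for completeness, unused below)
BATCH_SIZE = 30       # listed for completeness, unused below
DROP_EVERY = 15       # epochs between learning-rate drops
DROP_FACTOR = 10.0    # the rate is divided by this factor at each drop
TOTAL_EPOCHS = 45     # training stops after this many epochs


def step_lr(epoch, base_lr=BASE_LR, drop_every=DROP_EVERY, factor=DROP_FACTOR):
    """Return the learning rate for a 0-indexed epoch (assumed convention)."""
    return base_lr / (factor ** (epoch // drop_every))


# One learning rate per training epoch.
schedule = [step_lr(epoch) for epoch in range(TOTAL_EPOCHS)]

# Print the rate at the first epoch of each 15-epoch stage.
for start in range(0, TOTAL_EPOCHS, DROP_EVERY):
    print(f"epochs {start}-{start + DROP_EVERY - 1}: lr = {schedule[start]:g}")
```

Under these assumptions training passes through three stages of 15 epochs each, with the rate falling by a factor of 10 between stages.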