WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection

Authors: Liang Peng, Senbo Yan, Boxi Wu, Zheng Yang, Xiaofei He, Deng Cai

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments validate the effectiveness of our method. In summary, our contributions can be listed as follows: Firstly, we explore a novel method (WeakM3D) towards weakly supervised monocular 3D detection, removing the reliance on 3D box labels. Secondly, we pose the main challenges in WeakM3D and correspondingly introduce four effective strategies to resolve them, including geometric alignment loss, ray tracing loss, loss balancing, and learning disentanglement. Thirdly, evaluated on the KITTI benchmark, our method builds a strong baseline for weakly supervised monocular 3D detection, which even outperforms some existing fully supervised methods which use massive 3D box labels.
Researcher Affiliation | Collaboration | 1 State Key Lab of CAD&CG, Zhejiang University; 2 FABU Inc. {pengliang, yansenbo, wuboxi}@zju.edu.cn; {yangzheng}@fabu.ai; {xiaofeihe, dengcai}@cad.zju.edu.cn
Pseudocode | No | The paper describes the method and network architecture in text and diagrams (e.g., Figure 7), but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are available at https://github.com/SPengLiang/WeakM3D.
Open Datasets | Yes | Like most prior fully supervised works do, we conduct experiments on KITTI Geiger et al. (2012) dataset. ... To obtain an initial object point cloud, we adopt Mask-RCNN He et al. (2017) pretrained on COCO Lin et al. (2014).
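The quoted pipeline relies on a COCO-pretrained Mask R-CNN to obtain instance masks for the initial object point clouds, but the paper excerpt does not specify the exact implementation. As a hedged sketch only, a torchvision Mask R-CNN could play that role; the 0.7 score threshold, the example image path, and the torchvision API choice are assumptions, not details from the paper.

```python
# Sketch: instance masks from a COCO-pretrained Mask R-CNN (torchvision),
# standing in for the pretrained segmentation model mentioned in the paper.
# The score threshold and image path are illustrative assumptions.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Newer torchvision versions prefer the `weights=` argument over `pretrained=True`.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = to_tensor(Image.open("kitti/training/image_2/000000.png").convert("RGB"))
with torch.no_grad():
    pred = model([image])[0]

# Keep confident 'car' detections (COCO label id 3) and binarize their soft masks.
keep = (pred["scores"] > 0.7) & (pred["labels"] == 3)
car_masks = pred["masks"][keep, 0] > 0.5  # (N, H, W) boolean instance masks
```

In the paper's setting, such masks would then be combined with the projected LiDAR points to extract per-object point clouds; that step is not shown here.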
Dataset Splits | Yes | Following the common practice Chen et al. (2017a), the 7,481 samples are further divided into training and validation splits, containing 3,712 and 3,769 images, respectively.
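For context, the Chen et al. split referenced above is usually distributed as two index files. A minimal sketch of reading it is shown below; the ImageSets/train.txt and ImageSets/val.txt paths are an assumption based on common KITTI tooling, not something stated in the paper.

```python
# Sketch: load the standard KITTI train/val split (3,712 / 3,769 samples).
# The ImageSets/*.txt layout is assumed from common KITTI tooling.
from pathlib import Path

def read_split(path):
    """Return the list of sample indices (e.g. '000000') listed in a split file."""
    return [line.strip() for line in Path(path).read_text().splitlines() if line.strip()]

train_ids = read_split("kitti/ImageSets/train.txt")
val_ids = read_split("kitti/ImageSets/val.txt")
assert len(train_ids) == 3712 and len(val_ids) == 3769, "unexpected split sizes"
```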
Hardware Specification | Yes | Our method is implemented by PyTorch Paszke et al. (2019) and trained on a Titan V GPU.
Software Dependencies | No | The paper mentions 'PyTorch' but does not provide a specific version number or other software dependencies with their versions.
Experiment Setup | Yes | Our method is implemented by PyTorch Paszke et al. (2019) and trained on a Titan V GPU. We use the Adam optimizer Kingma & Ba (2014) with an initial learning rate of 10^-4. We train our network for 50 epochs. ... We train our network with a batch size of 8 by default. ... For the frozen dimensions for cars, we empirically adopt 1.6, 1.8, 4.0 meters as the height, width, and length, respectively. R for the point density in Equation 3 is set to 0.4-meter.
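Gathering the quoted hyper-parameters in one place, a hedged sketch of the training configuration might look like the following; the dummy detector module and placeholder loss are not the authors' code, and only the numeric values come from the paper.

```python
# Sketch of the quoted setup: Adam, lr 1e-4, 50 epochs, batch size 8,
# frozen car dimensions (h, w, l) = (1.6, 1.8, 4.0) m, point-density radius R = 0.4 m.
# The detector and loss below are dummy placeholders for illustration only.
import torch
import torch.nn as nn

LEARNING_RATE = 1e-4
NUM_EPOCHS = 50
BATCH_SIZE = 8
FROZEN_CAR_DIMS = (1.6, 1.8, 4.0)   # height, width, length in meters
POINT_DENSITY_RADIUS = 0.4          # R in Equation 3, in meters

detector = nn.Linear(128, 7)        # dummy stand-in for the monocular 3D detector head
optimizer = torch.optim.Adam(detector.parameters(), lr=LEARNING_RATE)

for epoch in range(NUM_EPOCHS):
    features = torch.randn(BATCH_SIZE, 128)   # placeholder image features
    boxes_3d = detector(features)             # e.g. (x, y, z, h, w, l, yaw) per sample
    loss = boxes_3d.pow(2).mean()              # placeholder for the LiDAR-supervised losses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```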