DeRPN: Taking a Further Step toward More General Object Detection

Authors: Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie
Pages: 9046-9053

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments conducted on both general object detection datasets (Pascal VOC 2007, 2012 and MS COCO) and scene text detection datasets (ICDAR 2013 and COCO-Text) all prove that our DeRPN can significantly outperform RPN.
Researcher Affiliation | Academia | Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie; School of Electronic and Information Engineering, South China University of Technology; xie.lele@mail.scut.edu.cn, liu.yuliang@mail.scut.edu.cn, eelwjin@scut.edu.cn, zchengxie@gmail.com
Pseudocode | No | The paper describes the methods and equations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code has been released at https://github.com/HCIILAB/DeRPN.
Open Datasets | Yes | Comprehensive experiments conducted on both general object detection datasets (Pascal VOC 2007, 2012 and MS COCO) and scene text detection datasets (ICDAR 2013 and COCO-Text) all prove that our DeRPN can significantly outperform RPN.
Dataset Splits | Yes | Experiments on MS COCO: In addition, we verified our method on MS COCO 2017, which consists of a training set (118k images), test set (20k images) and validation set (5k images).
Hardware Specification | Yes | We also evaluated the inference time on a single TITAN XP GPU.
Software Dependencies | No | The paper mentions general frameworks like CNN but does not specify software names with version numbers for reproducibility.
Experiment Setup | Yes | The settings of RPN, training, and testing followed that of (Ren et al. 2015)... We set the anchor strings as a geometric progression (denoted as {a_n}), i.e., (16, 32, 64, 128, 256, 512, 1024)... β is used to adjust magnitude of interval, which is intended to be 0.1 in our experiments... λ is a balancing parameter for L_cls and L_reg, which is empirically set to 10. For each scale, we randomly sample at most 30 positive and negative anchor strings to form a mini-batch.
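
The anchor-string setup quoted under Experiment Setup can be illustrated with a short Python sketch. It is not the authors' released code. The function names `anchor_strings` and `sample_strings` are hypothetical. The common ratio of 2 is read off the listed values (16 through 1024). The cap of 30 is assumed here to apply to positives and negatives separately, which the quoted sentence does not state explicitly.

```python
import random


def anchor_strings(first=16, ratio=2, count=7):
    """Geometric progression a_n = first * ratio**n (hypothetical helper)."""
    return [first * ratio ** n for n in range(count)]


def sample_strings(positives, negatives, max_each=30, rng=None):
    """Randomly keep at most max_each positives and max_each negatives.

    Assumption: the limit applies to each class separately.
    """
    rng = rng or random.Random()
    pos = rng.sample(positives, min(len(positives), max_each))
    neg = rng.sample(negatives, min(len(negatives), max_each))
    return pos, neg


print(anchor_strings())  # [16, 32, 64, 128, 256, 512, 1024]
```

Under these assumptions, `sample_strings` would be called once per scale, with the positive and negative anchor-string candidates for that scale, to form the mini-batch.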