PedHunter: Occlusion Robust Pedestrian Detector in Crowded Scenes

Authors: Cheng Chi, Shifeng Zhang, Junliang Xing, Zhen Lei, Stan Z. Li, Xudong Zou
Pages: 10639-10646

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To sufficiently verify the effectiveness of the proposed components, we construct ablation study on all four datasets including CrowdHuman, CityPersons, Caltech and SUR-PED. We first construct a baseline detector based on FPN (Lin et al. 2017) with ResNet-50 (He et al. 2016). The performance of the baseline model is shown in Table 1.
Researcher Affiliation | Academia | (1) Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China; (2) CBSR & NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China; (3) University of Chinese Academy of Sciences, Beijing, China; (4) Macau University of Science and Technology, Macao, China
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The proposed dataset, source codes and trained models are available at https://github.com/ChiCheng123/PedHunter.
Open Datasets | Yes | To facilitate further studies on the occluded pedestrian detection in surveillance scenes, we release a new pedestrian dataset, called SUR-PED, with a total of over 162k high-quality manually labeled instances in 10k images. The proposed dataset, source codes and trained models are available at https://github.com/ChiCheng123/PedHunter. Based on this proposed dataset and existing datasets including CrowdHuman (Shao et al. 2018), extended CityPersons (Zhang, Benenson, and Schiele 2017) and extended Caltech-USA (Dollár et al. 2009) with their additional head annotations (Chi et al. 2019), several experiments are conducted to demonstrate the superiority of the proposed method, especially for the crowded scenes.
Dataset Splits | Yes | SUR-PED dataset... contains 6,000, 1,000 and 3,000 images for training, validation and testing subsets, respectively. CrowdHuman dataset is divided into training (15,000 images), validation (4,370 images) and testing (5,000 images) subsets.
Hardware Specification | Yes | The proposed PedHunter is trained on 16 GTX 1080Ti GPUs with a mini-batch of 2 per GPU for CrowdHuman, Caltech-USA and SUR-PED, and the mini-batch size for CityPersons is 1 per GPU.
Software Dependencies | No | The paper mentions software components like ResNet-50 and FPN, and the optimization method SGD, but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | We fine-tune the model using SGD with 0.9 momentum, 0.0001 weight decay. The proposed PedHunter is trained on 16 GTX 1080Ti GPUs with a mini-batch of 2 per GPU for CrowdHuman, Caltech-USA and SUR-PED, and the mini-batch size for CityPersons is 1 per GPU. For the first 13 training epochs, the learning rate is set to 0.04, and we decrease it by a factor of 10 and 100 for another 4 and 3 epochs, respectively (for SUR-PED).
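
The training schedule quoted in the Experiment Setup row can be made concrete with a short sketch. This is an illustration only and is not taken from the authors' code. It assumes that "decrease it by a factor of 10 and 100" means dividing the base rate of 0.04 by 10 and by 100 (giving 0.004 and 0.0004), and that the per-GPU mini-batches simply add up across the 16 GPUs. The names `PHASES`, `learning_rate` and `effective_batch_size` are hypothetical.

```python
# Hyperparameters quoted in the Experiment Setup row (SUR-PED schedule).
BASE_LR = 0.04
MOMENTUM = 0.9
WEIGHT_DECAY = 0.0001
NUM_GPUS = 16

# (number of epochs, divisor applied to BASE_LR) for each phase.
# Assumption: both divisors are relative to the base learning rate.
PHASES = [(13, 1), (4, 10), (3, 100)]
TOTAL_EPOCHS = sum(length for length, _ in PHASES)


def learning_rate(epoch):
    """Return the step-schedule learning rate for a 0-indexed epoch."""
    start = 0
    for length, divisor in PHASES:
        if epoch < start + length:
            return BASE_LR / divisor
        start += length
    raise ValueError(f"epoch {epoch} is outside the {TOTAL_EPOCHS}-epoch schedule")


def effective_batch_size(per_gpu, num_gpus=NUM_GPUS):
    """Total images per SGD step, assuming plain data parallelism."""
    return per_gpu * num_gpus


if __name__ == "__main__":
    for epoch in (0, 12, 13, 16, 17, 19):
        print(f"epoch {epoch:2d}: lr = {learning_rate(epoch):.4f}")
    print("batch size (CrowdHuman, Caltech-USA, SUR-PED):", effective_batch_size(2))
    print("batch size (CityPersons):", effective_batch_size(1))
```

Under these assumptions the SUR-PED run lasts 20 epochs in total, with an overall batch of 32 images per step for CrowdHuman, Caltech-USA and SUR-PED, and 16 for CityPersons.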