Beta R-CNN: Looking into Pedestrian Detection from Another Perspective

Authors: Zixuan Xu, Banghuai Li, Ye Yuan, Anhong Dang

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the extremely crowded benchmarks CrowdHuman [1] and CityPersons [2] show that our proposed approach can outperform the state-of-the-art results, which strongly validates the superiority of our method.
Researcher Affiliation | Collaboration | Zixuan Xu, Peking University, zixuanxu@pku.edu.cn; Banghuai Li, Megvii Research, libanghuai@megvii.com; Ye Yuan, Megvii Research, yuanye@megvii.com; Anhong Dang, Peking University, ahdang@pku.edu.cn
Pseudocode | No | No structured pseudocode or algorithm blocks were found.
Open Source Code | No | Code will be released at github.com/Guardian44x/Beta-R-CNN.
Open Datasets | Yes | CityPersons dataset: the CityPersons dataset [2] is a subset of Cityscapes that consists only of person annotations. There are 2975 images for training, 500 for validation, and 1575 for testing; the average number of pedestrians per image is 7. We evaluate our proposed method under the full-body setting, following the evaluation protocol in [2], and the partition of the validation set follows the standard setting in [19] based on visibility: Heavy [0, 0.65], Partial [0.65, 0.9], Bare [0.9, 1], Reasonable [0.65, 1]. CrowdHuman dataset: the CrowdHuman dataset [1] was recently released to specifically target the crowd issue in the human detection task. There are 15000, 4370, and 5000 images in the training, validation, and testing sets respectively.
Dataset Splits | Yes | CityPersons: 2975 images for training, 500 for validation, and 1575 for testing. CrowdHuman: 15000, 4370, and 5000 images in the training, validation, and testing sets respectively.
Hardware Specification | Yes | We take the CrowdHuman validation set with 800x1400 input size to conduct speed experiments on NVIDIA 2080Ti GPUs (8 GPUs), and the average speeds are 0.483 s/image (Cascade R-CNN baseline) and 0.487 s/image (Beta R-CNN) respectively.
Software Dependencies | No | No specific software dependencies with version numbers were mentioned.
Experiment Setup | Yes | For anchor settings, we follow the same anchor scales as in [30], while the aspect ratios are set to H:W = {1:1, 2:1, 3:1}. For training, the batch size is 16, split across 8 GPUs. Each training round includes 16000 iterations on CityPersons and 40000 iterations on CrowdHuman. The learning rate is initialized to 0.02 and divided by 10 at half and three-quarters of the total iterations respectively. During training, the sampling ratio of positive to negative proposals for the RoI branch is 1:1 for CrowdHuman and 1:4 for CityPersons.
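
The visibility partition quoted in the Open Datasets row (Heavy, Partial, Bare, Reasonable) can be illustrated with a small helper. This is a minimal sketch, not the authors' evaluation code: it assumes each annotation carries a full-body box and a visible box in (x, y, w, h) format and that visibility is computed as visible area divided by full-body area; all function and field names are hypothetical.

```python
# Minimal sketch of the CityPersons-style visibility partition quoted above.
# Assumptions (not from the paper): each annotation has "bbox_full" and
# "bbox_vis" boxes in (x, y, w, h) format, and visibility = visible / full area.

def box_area(box):
    """Area of an (x, y, w, h) box."""
    return max(box[2], 0) * max(box[3], 0)

def visibility(ann):
    """Fraction of the full-body box that is visible."""
    full = box_area(ann["bbox_full"])
    return box_area(ann["bbox_vis"]) / full if full > 0 else 0.0

# Subsets from the quoted protocol: Heavy [0, 0.65], Partial [0.65, 0.9],
# Bare [0.9, 1], Reasonable [0.65, 1]. Note that Reasonable overlaps the others.
SUBSETS = {
    "Heavy":      (0.00, 0.65),
    "Partial":    (0.65, 0.90),
    "Bare":       (0.90, 1.00),
    "Reasonable": (0.65, 1.00),
}

def partition(annotations):
    """Group validation annotations by visibility range."""
    groups = {name: [] for name in SUBSETS}
    for ann in annotations:
        v = visibility(ann)
        for name, (lo, hi) in SUBSETS.items():
            if lo <= v <= hi:
                groups[name].append(ann)
    return groups
```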
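
The Experiment Setup row specifies the optimization schedule precisely enough to sketch it. The snippet below is an illustrative reconstruction in PyTorch, not the released training code: the initial learning rate of 0.02 divided by 10 at one half and three quarters of the total iterations, the iteration counts, and the anchor aspect ratios come from the quote; the choice of SGD, the momentum, the weight decay, and the placeholder model are assumptions.

```python
# Sketch of the quoted training schedule (PyTorch). Values beyond the quote
# (SGD, momentum 0.9, weight decay 1e-4, dummy model) are assumptions.
import torch
from torch.optim.lr_scheduler import MultiStepLR

total_iters = 40000                      # 40000 on CrowdHuman, 16000 on CityPersons
model = torch.nn.Conv2d(3, 8, 3)         # placeholder for the detector

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.02,                             # initial learning rate from the quote
    momentum=0.9,                        # assumption, not stated in the quoted text
    weight_decay=1e-4,                   # assumption
)
# Divide the learning rate by 10 at 1/2 and 3/4 of the total iterations.
scheduler = MultiStepLR(
    optimizer,
    milestones=[total_iters // 2, total_iters * 3 // 4],
    gamma=0.1,
)

# Quoted anchor aspect ratios H:W = {1:1, 2:1, 3:1}.
aspect_ratios_h_over_w = [1.0, 2.0, 3.0]

for it in range(total_iters):
    # ... forward/backward pass over a batch of 16 images split across 8 GPUs ...
    optimizer.step()
    scheduler.step()                     # stepped per iteration since milestones are in iterations
```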
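
The Hardware Specification row reports average per-image inference times (0.483 s/image for the Cascade R-CNN baseline versus 0.487 s/image for Beta R-CNN) on the CrowdHuman validation set at 800x1400 input. A common way to obtain such numbers is sketched below; it is not the authors' benchmarking script, and the helper name, the model, and the inputs are placeholders.

```python
# Sketch of per-image GPU timing of the kind quoted above (hypothetical helper).
# Synchronize before reading the clock so pending CUDA kernels are included.
import time
import torch

@torch.no_grad()
def average_seconds_per_image(model, images, device="cuda"):
    model.to(device).eval()
    # Warm up so CUDA context creation and cuDNN autotuning are not timed.
    for img in images[:5]:
        model(img.to(device))
    torch.cuda.synchronize()
    start = time.perf_counter()
    for img in images:
        model(img.to(device))
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / len(images)

if __name__ == "__main__":
    model = torch.nn.Conv2d(3, 8, 3, padding=1)               # stand-in for the detector
    images = [torch.randn(1, 3, 800, 1400) for _ in range(20)]  # fake 800x1400 inputs
    print(f"{average_seconds_per_image(model, images):.3f} s/image")
```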