Improving Crowded Object Detection via Copy-Paste

Authors: Jiangfan Deng, Dewen Fan, Xiaosong Qiu, Feng Zhou

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our approach can easily improve the state-of-the-art detector in typical crowded detection task by more than 2% without any bells and whistles.
Researcher Affiliation | Industry | Algorithm Research, Aibee Inc. jfdeng100@foxmail.com, {dwfan,xsqiu,fzhou}@aibee.com
Pseudocode | Yes | Algorithm 1: Overlay Depth-aware NMS (a hedged, speculative sketch of a depth-ordered suppression step is given after this table)
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | Pedestrian detection is the most typical task burdened by the crowdedness problem, so our experiments are conducted mainly on two datasets: CrowdHuman (Shao et al. 2018) and CityPersons (Zhang, Benenson, and Schiele 2017). [...] we prepare another sparse training set by re-labeling full body box of persons in COCO (Lin et al. 2014) to further evaluate the potential of our method. We name this train set as COCO-fullperson (we will release this dataset). Moreover, we use the category of car in KITTI (Geiger, Lenz, and Urtasun 2012) to further estimate the generality.
Dataset Splits | Yes | Since both the training and validation data hold the same level of crowdedness, we prepare another sparse training set by re-labeling full body box of persons in COCO (Lin et al. 2014) to further evaluate the potential of our method. [...] Table 1: Results on CrowdHuman val set.
Hardware Specification | Yes | We train the networks on 8 Nvidia V100 GPUs with 2 images on each GPU.
Software Dependencies | No | The paper mentions using "Mask R-CNN (He et al. 2017) model adopting ResNet-50 (He et al. 2016) as backbone" but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | During training, the short side of each image is resized to 800 and the long side is limited within 1400. Models are trained for 60k iterations starting from an initial learning rate of 0.02 (Faster R-CNN) or 0.01 (RetinaNet) and is reduced by 0.1 on 30k and 40k iters respectively. (A hedged configuration sketch based on these numbers follows the table.)
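To make the Experiment Setup row concrete, below is a minimal sketch of the reported schedule: short side resized to 800 with the long side capped at 1400, 60k iterations, base learning rate 0.02 (Faster R-CNN) or 0.01 (RetinaNet) decayed by 0.1 at 30k and 40k iterations, and an effective batch size of 16 (8 V100 GPUs x 2 images per GPU). The helper names and plain-Python form are illustrative assumptions, not the authors' unreleased training code.

```python
# Hedged sketch of the training schedule quoted in the Experiment Setup row.
# Function and variable names are illustrative, not taken from the paper.

def resize_shorter_side(width, height, short=800, max_long=1400):
    """Scale so the short side becomes `short`, capping the long side at `max_long`."""
    scale = short / min(width, height)
    if max(width, height) * scale > max_long:
        scale = max_long / max(width, height)
    return round(width * scale), round(height * scale)

def learning_rate(iteration, base_lr=0.02, steps=(30_000, 40_000), gamma=0.1):
    """Step schedule: base_lr, multiplied by 0.1 at 30k and again at 40k iterations."""
    lr = base_lr
    for step in steps:
        if iteration >= step:
            lr *= gamma
    return lr

TOTAL_ITERS = 60_000                     # 60k iterations
IMAGES_PER_GPU = 2                       # 2 images per GPU
NUM_GPUS = 8                             # 8 Nvidia V100 GPUs
BATCH_SIZE = IMAGES_PER_GPU * NUM_GPUS   # effective batch size of 16

if __name__ == "__main__":
    print(resize_shorter_side(1920, 1080))  # long side gets capped at 1400
    print(learning_rate(0), learning_rate(35_000), learning_rate(45_000))
    # 0.02, 0.002, 0.0002 (Faster R-CNN base LR; use base_lr=0.01 for RetinaNet)
```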
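The Pseudocode row refers to the paper's Algorithm 1, "Overlay Depth-aware NMS", which is not reproduced in this report. The following is therefore only a guess at the general shape of a depth-ordered suppression step for overlapping instances: boxes are processed from front to back, and a box that overlaps an already-kept, nearer box by more than a threshold is dropped. Every name, the depth ordering, the IoU criterion, and the threshold are assumptions for illustration; consult the paper for the actual algorithm.

```python
# Speculative sketch of a depth-ordered suppression step. This is NOT the
# paper's Algorithm 1; it is written from the algorithm's name alone.

def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def depth_aware_suppress(boxes, depths, thresh=0.5):
    """Keep boxes front-to-back; drop a box that overlaps a nearer kept box too much.

    `boxes`: list of (x1, y1, x2, y2); `depths`: smaller value means closer to the camera.
    Returns the indices of the kept boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: depths[i])  # nearest first
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in kept):
            kept.append(i)
    return kept
```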