Cascade RPN: Delving into High-Quality Region Proposal Network with Adaptive Convolution

Authors: Thang Vu, Hyunjun Jang, Trung X. Pham, Chang D. Yoo

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments are performed on the COCO 2017 detection dataset [26]. All the models are trained on the train split (115k images). The region proposal performance and ablation analysis are reported on val split (5k images), and the benchmarking detection performance is reported on test-dev split (20k images).
Researcher Affiliation | Academia | Thang Vu, Hyunjun Jang, Trung X. Pham, Chang D. Yoo, Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, {thangvubk,wiseholi,trungpx,cd_yoo}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1: Cascade RPN (a hedged sketch of this loop follows the table)
Open Source Code | Yes | The code is made publicly available at https://github.com/thangvubk/Cascade-RPN.
Open Datasets | Yes | The experiments are performed on the COCO 2017 detection dataset [26].
Dataset Splits | Yes | All the models are trained on the train split (115k images). The region proposal performance and ablation analysis are reported on val split (5k images), and the benchmarking detection performance is reported on test-dev split (20k images).
Hardware Specification | Yes | It takes about 12 hours for the models to converge on 8 Tesla V100 GPUs.
Software Dependencies | No | The models are implemented with PyTorch [29] and mmdetection [8]. Specific version numbers for these software components are not provided.
Experiment Setup | Yes | The model consists of two stages, with ResNet50-FPN [24] being its backbone. A single anchor per location is used with sizes of 32², 64², 128², 256², and 512² corresponding to the feature levels C2, C3, C4, C5, and C6, respectively [24]. The first stage uses the anchor-free metric for sample discrimination, with the center-region and ignore-region thresholds σ_ctr and σ_ign, adopted from [40, 37], set to 0.2 and 0.5. The second stage uses the anchor-based metric with an IoU threshold of 0.7. The multi-task loss is set with the stage-wise weights α1 = α2 = 1 and the balance term λ = 10. The NMS threshold is set to 0.8. In all experiments, the long edge and the short edge of the images are resized to 1333 and 800 respectively without changing the aspect ratio. No data augmentation is used except for standard horizontal image flipping... The models are trained with 8 GPUs with a batch size of 16 (two images per GPU) for 12 epochs using the SGD optimizer. The learning rate is initialized to 0.02 and divided by 10 after 8 and 11 epochs. (The key schedule numbers are collected in the training sketch below.)
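The paper's Algorithm 1 describes a multi-stage proposal loop: each stage refines the anchors, and an adaptive convolution realigns the features to the refined anchors before the next regression. Below is a minimal PyTorch sketch of that loop, assuming torchvision's deform_conv2d as the adaptive-convolution primitive (the paper implements adaptive convolution as a deformable convolution whose offsets come from the anchors rather than from a learned branch); anchor_offsets and apply_deltas are hypothetical helpers, and none of this is the authors' released code.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import deform_conv2d


class AdaptiveConv(nn.Module):
    """3x3 convolution whose sampling locations follow the current anchor.

    Sketched with deformable convolution; the offsets are computed from the
    anchor center/shape instead of being predicted from the features.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(channels, channels, 3, 3))
        nn.init.kaiming_uniform_(self.weight, a=1)

    def forward(self, feat, offsets):
        # offsets: (N, 2*3*3, H, W), one (dy, dx) per kernel tap, derived from
        # the anchors so the kernel samples inside the anchor box.
        return deform_conv2d(feat, offsets, self.weight, padding=1)


class CascadeRPNStage(nn.Module):
    def __init__(self, channels: int, with_cls: bool):
        super().__init__()
        self.adapt = AdaptiveConv(channels)
        self.reg = nn.Conv2d(channels, 4, 1)                        # anchor deltas
        self.cls = nn.Conv2d(channels, 1, 1) if with_cls else None  # objectness

    def forward(self, feat, anchors, stride):
        offsets = anchor_offsets(anchors, stride)   # hypothetical helper
        feat = F.relu(self.adapt(feat, offsets))
        anchors = apply_deltas(anchors, self.reg(feat))  # hypothetical helper
        scores = self.cls(feat) if self.cls is not None else None
        return feat, anchors, scores


def cascade_rpn_forward(stages, feat, anchors, stride):
    """Algorithm 1, loosely: refine anchors stage by stage; only the last
    stage carries a classification branch that scores the final anchors."""
    scores = None
    for stage in stages:
        feat, anchors, scores = stage(feat, anchors, stride)
    # NMS (threshold 0.8 in the paper) is applied to the scored anchors downstream.
    return anchors, scores

In the two-stage configuration described above, stages would be [CascadeRPNStage(c, with_cls=False), CascadeRPNStage(c, with_cls=True)], matching the paper's choice of scoring only once, after the final refinement.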
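The schedule quoted in the last row maps onto a standard PyTorch training loop. The sketch below collects those numbers (SGD, lr 0.02 divided by 10 after epochs 8 and 11, 12 epochs, batch size 16); the momentum and weight decay values are assumed mmdetection defaults rather than figures stated in the quote, and build_cascade_rpn / train_loader are placeholders, not the repository's API.

import torch

model = build_cascade_rpn()  # placeholder constructor, not the repo's API
# momentum/weight decay are assumed mmdetection defaults, not given in the quote
optimizer = torch.optim.SGD(model.parameters(), lr=0.02,
                            momentum=0.9, weight_decay=1e-4)
# lr initialized to 0.02, divided by 10 after epochs 8 and 11 (12-epoch schedule)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[8, 11], gamma=0.1)

for epoch in range(12):
    for images, targets in train_loader:  # batch size 16 = 2 images x 8 GPUs
        # multi-task loss with stage weights alpha1 = alpha2 = 1 and
        # regression balance term lambda = 10, per the quoted setup
        loss = model(images, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()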