Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN
Authors: Hang Xu, Linpu Fang, Xiaodan Liang, Wenxiong Kang, Zhenguo Li
AAAI 2020, pp. 12492-12499
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that the proposed method significantly outperforms multiple-branch models and achieves state-of-the-art results on multiple object detection benchmarks (mAP: 49.1% on COCO). |
| Researcher Affiliation | Collaboration | Hang Xu¹, Linpu Fang², Xiaodan Liang³, Wenxiong Kang², Zhenguo Li¹; ¹Huawei Noah's Ark Lab, ²South China University of Technology, ³Sun Yat-Sen University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information for open-source code, such as a repository link or an explicit statement of code release. |
| Open Datasets | Yes | We evaluate the performance of our Universal-RCNN on three object detection domains with different annotations of categories: MSCOCO 2017 (Lin et al. 2014), Visual Genome (VG) (Krishna et al. 2016), and ADE (Zhou et al. 2017). |
| Dataset Splits | Yes | MSCOCO is a common object detection dataset with 80 object classes, which contains 118K training images, 5K validation images (denoted as minival) and 20K unannotated testing images (denoted as test-dev) as common practice. For VG, we use ... 88K images for training and 5K images for testing... For ADE, we consider 445 classes and use 20K images for training and 1K images for testing... |
| Hardware Specification | Yes | All experiments are conducted on a single server with 8 Tesla V100 GPUs by using the Pytorch framework. |
| Software Dependencies | No | The paper mentions using the 'Pytorch framework' but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | The hyper-parameters in training mostly follow Lin et al. During both training and testing, we resize the input image such that the shorter side has 800 pixels. ... The total number of proposed regions after NMS is Nr = 512. ... In the graph learner module, we use a linear transformation layer of size 256 ... For the spatial-aware GCN, we use two weighted graph convolutional layers with dimensions of 256 and 128 respectively... Each GCN consists of K = 8 spatial weight terms... For training, SGD with weight decay of 0.0001 and momentum of 0.9 is adopted to optimize all models. The batch size is set to 16 with 2 images on each GPU. The initial learning rate is 0.02, reduced twice (×0.1) during the training process. We train 12 epochs for all models in an end-to-end manner. (Hedged sketches of this setup follow the table.) |
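
To make the graph learner and spatial-aware GCN settings quoted above concrete, here is a minimal PyTorch sketch. Only the numbers from the paper are fixed (a 256-d linear transformation in the graph learner, two graph convolutional layers of dimensions 256 and 128, K = 8 spatial weight terms, Nr = 512 proposals); the 1024-d region features, the dot-product similarity, and the construction of the spatial masks are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphLearner(nn.Module):
    """Learns a soft adjacency matrix over the Nr = 512 region proposals.
    The 256-d projection is from the paper; the 1024-d input and the
    dot-product similarity are assumptions for illustration."""
    def __init__(self, in_dim=1024, proj_dim=256):
        super().__init__()
        self.proj = nn.Linear(in_dim, proj_dim)

    def forward(self, x):                 # x: (Nr, in_dim)
        h = self.proj(x)                  # (Nr, 256)
        sim = h @ h.t()                   # pairwise region similarity
        return F.softmax(sim, dim=-1)     # row-normalized adjacency

class SpatialGCNLayer(nn.Module):
    """One weighted graph convolution with K = 8 spatial weight terms.
    How the masks encode relative box geometry is not specified in the
    quoted text, so they are passed in as an opaque (K, Nr, Nr) tensor."""
    def __init__(self, in_dim, out_dim, K=8):
        super().__init__()
        self.weights = nn.ModuleList([nn.Linear(in_dim, out_dim) for _ in range(K)])

    def forward(self, x, adj, spatial_masks):
        out = sum(self.weights[k]((adj * spatial_masks[k]) @ x)
                  for k in range(len(self.weights)))
        return F.relu(out)

# Two-layer spatial-aware GCN with the 256/128 dimensions from the paper.
feats = torch.randn(512, 1024)           # hypothetical region features
masks = torch.ones(8, 512, 512) / 8      # placeholder spatial masks
adj = GraphLearner()(feats)
h = SpatialGCNLayer(1024, 256)(feats, adj, masks)
h = SpatialGCNLayer(256, 128)(h, adj, masks)   # final 128-d relation features
```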
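
The quoted training hyper-parameters likewise map onto a standard PyTorch optimizer setup. The SGD settings, batch size, initial learning rate, and 12-epoch budget come from the paper; it does not say at which epochs the rate drops, so the [8, 11] milestones below are an assumption borrowed from the common 1x detection schedule, and `model` is a placeholder standing in for the full Universal-RCNN.

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder; stands in for Universal-RCNN

# SGD with weight decay 0.0001 and momentum 0.9, initial lr 0.02 (paper).
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.02, momentum=0.9, weight_decay=1e-4
)

# Reduced twice by x0.1 over 12 epochs; the exact milestones are assumed.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[8, 11], gamma=0.1
)

for epoch in range(12):
    # ... one pass over 16-image batches (2 images per GPU on 8 V100s) ...
    optimizer.step()      # would follow loss.backward() for each batch
    scheduler.step()
```

With 2 images per GPU across 8 GPUs, the effective batch size of 16 is consistent with the 0.02 base rate under the usual linear-scaling convention for detection training.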