MetaAnchor: Learning to Detect Objects with Customized Anchors

Authors: Tong Yang, Xiangyu Zhang, Zeming Li, Wenqiang Zhang, Jian Sun

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiment on the COCO detection task shows that MetaAnchor consistently outperforms the counterparts in various scenarios.
Researcher Affiliation | Collaboration | Tong Yang, Xiangyu Zhang, Zeming Li, Wenqiang Zhang, Jian Sun; Megvii Inc (Face++), {yangtong,zhangxiangyu,lizeming,sunjian}@megvii.com; Fudan University, wqzhang@fudan.edu.cn
Pseudocode | No | The paper describes the methodology using text and mathematical equations but does not include structured pseudocode or algorithm blocks. (A rough sketch of the described generator follows the table.)
Open Source Code | No | For the YOLOv2 baseline, we use the anchors shown in the open source project (https://github.com/pjreddie/darknet) to detect objects.
Open Datasets | Yes | In this section we mainly evaluate our proposed MetaAnchor on the COCO object detection task [24]. The basic detection framework is RetinaNet [23] as introduced in Section 3.2, whose backbone feature extractor is ResNet-50 [13] pretrained on the ImageNet classification dataset [34].
Dataset Splits | Yes | Following the common practice [23] in the COCO detection task, for training we use two different dataset splits: COCO-all and COCO-mini; for testing, all results are evaluated on the minival set, which contains 5000 images. COCO-all includes all the images in the original training and validation sets excluding minival images, while COCO-mini is a subset of around 20000 images.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions frameworks like RetinaNet and YOLOv2 and refers to the 'Darknet' open source project, but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | For fair comparison, we follow most of the settings in [23] (image size, learning rate, etc.) for all the experiments, except for a few differences as follows. In [23], 3 × 3 anchor boxes (i.e., 3 scales and 3 aspect ratios) are predefined for each level of detection head... the number of hidden neurons m is set to 128.
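
Since the paper describes its anchor-function generator only in text and equations (see the Pseudocode row), the following is a minimal PyTorch sketch of the core mechanism: a small two-layer network that maps an anchor box prior to the weights of the detection head's prediction layer, with the hidden width m = 128 taken from the Experiment Setup row. The class name AnchorFunctionGenerator, the (log height, log width) anchor encoding, and the 1x1-conv prediction layer are illustrative assumptions, not the authors' released code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AnchorFunctionGenerator(nn.Module):
    """Sketch of a weight generator G(b): given an anchor box prior b,
    produce the parameters of the detection head's output layer, so the
    head can be customized to arbitrary anchors at inference time."""

    def __init__(self, in_channels, out_channels, hidden=128):
        super().__init__()
        # two-layer generator: anchor encoding -> hidden code -> head weights
        self.encode = nn.Linear(2, hidden)
        self.to_weight = nn.Linear(hidden, out_channels * in_channels)
        self.to_bias = nn.Linear(hidden, out_channels)
        self.in_channels = in_channels
        self.out_channels = out_channels

    def forward(self, feature, anchor_box):
        # anchor_box: tensor of shape (2,), e.g. (log ah, log aw)
        code = F.relu(self.encode(anchor_box))
        w = self.to_weight(code).view(self.out_channels, self.in_channels, 1, 1)
        b = self.to_bias(code)
        # apply the generated 1x1 convolution to the shared feature map
        return F.conv2d(feature, w, b)

# Usage sketch: generate a box-regression head for one customized anchor.
gen = AnchorFunctionGenerator(in_channels=256, out_channels=4)
feat = torch.randn(1, 256, 32, 32)                # a detection-level feature map
box = torch.log(torch.tensor([64.0, 128.0]))      # hypothetical anchor (ah, aw)
pred = gen(feat, box)                             # -> shape (1, 4, 32, 32)

Because the weights are a function of the anchor rather than a fixed bank of 3 × 3 predictors, new anchor configurations can be fed in without retraining the head; the paper's full formulation (e.g., the standardized anchor encoding and any shared base weights) should be consulted for the exact design.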