DetNAS: Backbone Search for Object Detection

Authors: Yukang Chen, Tong Yang, Xiangyu Zhang, Gaofeng Meng, Xinyu Xiao, Jian Sun

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, we show the effectiveness of DetNAS on various detectors, for instance, the one-stage RetinaNet and the two-stage FPN. We empirically find that networks searched on object detection show consistent superiority compared to those searched on ImageNet classification. The resulting architecture achieves better performance than hand-crafted networks on COCO with much lower FLOPs complexity.
Researcher Affiliation | Collaboration | 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; 2. Megvii Technology. Emails: {yukang.chen, gfmeng, xinyu.xiao}@nlpr.ia.ac.cn; {yangtong, zhangxiangyu, sunjian}@megvii.com
Pseudocode | Yes | We formulate the supernet training process in Algorithm 1 in the supplementary material. We formulate this process as Algorithm 2 in the supplementary material.
Open Source Code | Yes | Code and models have been made available at: https://github.com/megvii-model/DetNAS
Open Datasets | Yes | For the ImageNet classification dataset, we use the commonly used 1.28M training images for supernet pre-training. We train on 8 GPUs with a total of 16 images per minibatch for 90k iterations on COCO and 22.5k iterations on VOC.
Dataset Splits | Yes | We split the detection datasets into a training set for supernet fine-tuning, a validation set for architecture search, and a test set for final evaluation. For VOC, the validation set contains 5k images randomly selected from trainval2007 + trainval2012, and the remainder is used for supernet fine-tuning. For COCO, the validation set contains 5k images randomly selected from trainval35k [13], and the remainder is used for supernet fine-tuning. (See the split sketch after this table.)
Hardware Specification | Yes | For the small search space, the GPUs are GTX 1080Ti. For the large search space, the GPUs are Tesla V100.
Software Dependencies | No | The paper mentions 'Detectron [6]' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For the ImageNet classification dataset, we use the commonly used 1.28M training images for supernet pre-training. To train the one-shot supernet backbone on ImageNet, we use a batch size of 1024 on 8 GPUs for 300k iterations. We set the initial learning rate to 0.5 and decrease it linearly to 0. The momentum is 0.9 and the weight decay is 4 × 10^-5. We train on 8 GPUs with a total of 16 images per minibatch for 90k iterations on COCO and 22.5k iterations on VOC. The initial learning rate is 0.02, which is divided by 10 at {60k, 80k} iterations on COCO and {15k, 20k} iterations on VOC. We use a weight decay of 1 × 10^-4 and a momentum of 0.9. (See the config sketch after this table.)
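
The Dataset Splits row describes randomly holding out 5k images for architecture search and keeping the rest for supernet fine-tuning. The snippet below is only a minimal sketch of such a split for a COCO-style annotation file; the annotation path, the random seed, and the variable names are our assumptions, not taken from the DetNAS code base.

```python
import json
import random

# Minimal sketch of the split described in the "Dataset Splits" row: hold out
# 5k images for architecture search and keep the rest for supernet fine-tuning.
# The annotation path and the seed are assumptions, not from the paper or code.
random.seed(0)

with open("annotations/instances_trainval35k.json") as f:  # hypothetical path
    coco = json.load(f)

image_ids = [img["id"] for img in coco["images"]]
random.shuffle(image_ids)

search_val_ids = set(image_ids[:5000])   # validation set for architecture search
finetune_ids = set(image_ids[5000:])     # training set for supernet fine-tuning
```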
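
The Experiment Setup row lists the concrete training hyperparameters. The sketch below simply collects those quoted values into plain Python dictionaries for quick reference; the dictionary and key names are our own and do not appear in the released code.

```python
# Hyperparameters quoted in the "Experiment Setup" row, gathered into plain
# dictionaries. Names are ours; values follow the quoted text.
SUPERNET_IMAGENET_PRETRAIN = dict(
    gpus=8,
    batch_size=1024,            # total mini-batch over 8 GPUs
    iterations=300_000,
    base_lr=0.5,                # decreased linearly to 0
    lr_schedule="linear",
    momentum=0.9,
    weight_decay=4e-5,
)

DETECTION_FINE_TUNING = dict(
    gpus=8,
    batch_size=16,              # total images per mini-batch over 8 GPUs
    iterations={"COCO": 90_000, "VOC": 22_500},
    base_lr=0.02,               # divided by 10 at the steps below
    lr_decay_steps={"COCO": (60_000, 80_000), "VOC": (15_000, 20_000)},
    momentum=0.9,
    weight_decay=1e-4,
)
```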