Restoring Negative Information in Few-Shot Object Detection

Authors: Yukuan Yang, Fangyun Wei, Miaojing Shi, Guoqi Li

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on ImageNet-LOC and PASCAL VOC show our method substantially improves the state-of-the-art few-shot object detection solutions.
Researcher Affiliation | Collaboration | Yukuan Yang, Tsinghua University, yyk17@mails.tsinghua.edu.cn; Fangyun Wei, Microsoft Research Asia, fawe@microsoft.com; Miaojing Shi, King's College London, miaojing.shi@kcl.ac.uk; Guoqi Li, Tsinghua University, liguoqi@mail.tsinghua.edu.cn
Pseudocode | No | The paper describes the method in text and provides architectural diagrams but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/yang-yk/NP-RepMet.
Open Datasets | Yes | Extensive experiments on ImageNet-LOC [1] and PASCAL VOC 2007 [10] demonstrate that our method substantially improves the SOTA (i.e. up to +11% on ImageNet-LOC and +19% on PASCAL VOC).
Dataset Splits | Yes | For classes in the ImageNet-LOC benchmark, they are mostly animals and birds species. 100 classes are selected as base (seen) classes for training while 214 classes are considered as new (unseen) classes for testing. Following [1], we adopt its 5-way K ∈ {1, 5, 10} shot few-shot detection setting. For benchmark PASCAL VOC 2007, 15 out of 20 VOC classes are selected for training, the rest 5 are for testing. We use same splits as in [10, 12, 11] and carry out K ∈ {1, 2, 3, 5, 10} shot detection.
Hardware Specification | No | Our network is trained with synchronized stochastic gradient descent (SGD) over 4 GPUs with mini-batch of 4 images (1 image per GPU).
Software Dependencies | No | The paper mentions using ResNet-101 as backbone with DCN and FPN, but it does not specify software versions (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA 10.x).
Experiment Setup | Yes | The total epoch number is 20 and the learning rate is initialized as 0.01 and then divided by 10 at epochs 4, 6 and 15. The weight decay and momentum parameters are set as 10^-4 and 0.9, respectively. NMS with threshold 0.7 is used to eliminate duplicated proposals generated by RPN. The top-2000 proposals will be used for category and location prediction. Last, soft-NMS [39] with threshold 0.6 is applied on the output as post-processing to merge duplicated bounding boxes.
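
The schedule and proposal filtering quoted in the Experiment Setup row can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, epochs are assumed to be 0-indexed with each drop taking effect at the start of the milestone epoch, boxes are assumed to be (x1, y1, x2, y2) tuples, and the top-2000 cut is assumed to follow NMS. The soft-NMS post-processing step is not covered.

```python
# Values below come from the Experiment Setup row; all names are hypothetical.
BASE_LR = 0.01
MILESTONES = (4, 6, 15)  # epochs at which the learning rate is divided by 10
TOTAL_EPOCHS = 20


def learning_rate(epoch, base_lr=BASE_LR, milestones=MILESTONES, gamma=0.1):
    """Step schedule: apply one factor of gamma per milestone already reached.

    Assumes 0-indexed epochs and that a drop applies from the milestone epoch on.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.7, top_k=2000):
    """Greedy NMS: visit boxes by descending score and keep a box only if its
    IoU with every box kept so far is at most iou_threshold.

    Returns the indices of the kept boxes, at most top_k of them.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == top_k:
                break
    return keep
```

Calling learning_rate(epoch) for each epoch in range(TOTAL_EPOCHS) gives the per-epoch rate under these assumptions. A training loop built on a deep learning framework would normally use that framework's own step scheduler and NMS operator instead of these helpers.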