Context-Transformer: Tackling Object Confusion for Few-Shot Detection

Authors: Ze Yang, Yali Wang, Xianyu Chen, Jianzhuang Liu, Yu Qiao

AAAI 2020, pp. 12653–12660

Reproducibility assessment: for each variable, the result and the supporting LLM response.
Research Type: Experimental. "Finally, we evaluate Context-Transformer on the challenging settings of few-shot detection and incremental few-shot detection. The experimental results show that our framework outperforms the recent state-of-the-art approaches."
Researcher Affiliation: Collaboration. (1) Shenzhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; (2) Huawei Noah's Ark Lab; (3) SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society.
Pseudocode: No. The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code: No. The paper does not contain any statement about making source code publicly available, nor a link to a code repository.
Open Datasets: Yes. "First, we set VOC07+12 as our target-domain task. ... Second, we choose a source-domain benchmark for pretraining. ... we remove 20 categories of COCO that are overlapped with VOC, and use the rest 60 categories of COCO as source-domain data."
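The 60/20 source/target category split can be reproduced mechanically even without the authors' code. Below is a minimal sketch, assuming pycocotools and a local COCO annotation file; the file path and the VOC-to-COCO name map are our own assumptions, not taken from the paper.

```python
# Sketch of the source-domain filtering quoted above: drop the 20 COCO
# categories that overlap with PASCAL VOC and keep the remaining 60.
# Assumes pycocotools and a local annotation file (hypothetical path).
from pycocotools.coco import COCO

VOC_TO_COCO = {
    "aeroplane": "airplane", "bicycle": "bicycle", "bird": "bird",
    "boat": "boat", "bottle": "bottle", "bus": "bus", "car": "car",
    "cat": "cat", "chair": "chair", "cow": "cow",
    "diningtable": "dining table", "dog": "dog", "horse": "horse",
    "motorbike": "motorcycle", "person": "person",
    "pottedplant": "potted plant", "sheep": "sheep", "sofa": "couch",
    "train": "train", "tvmonitor": "tv",
}  # 20 VOC classes and their COCO equivalents

coco = COCO("annotations/instances_train2017.json")
overlap_ids = coco.getCatIds(catNms=list(VOC_TO_COCO.values()))
source_cat_ids = [c for c in coco.getCatIds() if c not in overlap_ids]
assert len(source_cat_ids) == 60  # the 60 non-VOC COCO categories

# Keep only images containing at least one source-domain annotation.
source_img_ids = sorted(
    {img_id for cid in source_cat_ids for img_id in coco.getImgIds(catIds=[cid])}
)
print(f"{len(source_img_ids)} source-domain images over 60 categories")
```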
Dataset Splits: Yes. "First, we set VOC07+12 as our target-domain task. The few-shot training set consists of N images (per category) that are randomly sampled from the original train/val set. Unless stated otherwise, N is 5 in our experiments."
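Since no sampling script is released, the N-shot split construction is easy to misread. Here is a minimal sketch of the quoted procedure, with a hypothetical `images_by_class` mapping standing in for a prebuilt VOC07+12 trainval index (category name to image ids).

```python
# Minimal sketch of the N-shot split described above: sample N images
# per category from the trainval pool. `images_by_class` is a stand-in
# for the real VOC index; the paper does not publish its sampler.
import random

def build_few_shot_split(images_by_class, n_shot=5, seed=0):
    """Return a set of image ids with n_shot images drawn per category."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    split = set()
    for cls, img_ids in images_by_class.items():
        split.update(rng.sample(sorted(img_ids), n_shot))
    return split

# Toy usage: two categories with 20 candidate images each.
toy = {"dog": {f"dog_{i}" for i in range(20)},
       "cat": {f"cat_{i}" for i in range(20)}}
print(len(build_few_shot_split(toy, n_shot=5)))  # 10 ids, 5 per category
```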
Hardware Specification: Yes. "Finally, we implement our approach with PyTorch (Paszke et al. 2017), where all the experiments run on 4 Titan Xp GPUs."
Software Dependencies: No. The paper mentions PyTorch (Paszke et al. 2017) but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup: Yes. "For fine-tuning in the target domain, the batch size is 64, the optimization is SGD with momentum 0.9, the initial learning rate is 4×10⁻³ (decreased by 10 after 3k and 3.5k iterations), the weight decay is 5×10⁻⁴, and the total number of training iterations is 4k."
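For readers re-implementing the fine-tuning stage, the quoted hyperparameters map directly onto a standard PyTorch optimizer and milestone scheduler. This is a sketch against the current `torch.optim` API, not the authors' code (which is unavailable); the model and the training batch are placeholders.

```python
# Hedged sketch of the quoted fine-tuning schedule: SGD, batch size 64,
# lr 4e-3 decayed x10 at 3k and 3.5k iterations, weight decay 5e-4,
# 4k iterations total. The linear layer stands in for the detector.
import torch

model = torch.nn.Linear(256, 20)  # placeholder for the detection model
optimizer = torch.optim.SGD(
    model.parameters(), lr=4e-3, momentum=0.9, weight_decay=5e-4
)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[3000, 3500], gamma=0.1
)

for it in range(4000):  # 4k iterations, scheduler stepped per iteration
    optimizer.zero_grad()
    loss = model(torch.randn(64, 256)).mean()  # placeholder batch of 64
    loss.backward()
    optimizer.step()
    scheduler.step()
```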