LSTD: A Low-Shot Transfer Detector for Object Detection

Authors: Hao Chen, Yali Wang, Guoyou Wang, Yu Qiao

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we examine our LSTD on a number of challenging low-shot detection experiments, where LSTD outperforms other state-of-the-art approaches.
Researcher Affiliation | Academia | (1) Shenzhen Key Laboratory of Virtual Reality and Human Interaction Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China; (2) Huazhong University of Science and Technology, China; (3) The Chinese University of Hong Kong, Hong Kong
Pseudocode | Yes | Algorithm 1: Regularized Transfer Learning of LSTD
Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor a link to a code repository for the described method.
Open Datasets | Yes | Since our LSTD is a low-shot detector within a regularized transfer learning framework, we adopt a number of detection benchmarks, i.e., COCO (Lin et al. 2014), ImageNet2015 (Deng et al. 2009), VOC2007 and VOC2010 (Everingham et al. 2010), respectively as source and target of three transfer tasks (Table 1).
Dataset Splits | No | The paper specifies the number of training images per class (1/2/5/10/30) and mentions using standard test sets for evaluation, but it does not describe a separate validation split (e.g., percentages or counts).
Hardware Specification | No | The paper states "All our experiments are performed on Caffe (Jia et al. 2014)" but gives no details about the hardware used (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions using Caffe (Jia et al. 2014) and Adam (Kingma and Ba 2015) but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | In the source domain, we feed 32 training images into LSTD for each mini-batch in task 1/2/3... In the target domain, all the training settings are the same as the ones in the source domain, except that 64/64/64 proposals are selected to train the (K+1)-object classifier, the background depression regularization is used on conv5_3, the temperature parameter in the transfer-knowledge regularization is 2... and the weight coefficients for both background depression and transfer-knowledge are 0.5. Finally, the optimization strategy for both source and target is Adam..., where the initial learning rate is 0.0002 (with 0.1 decay), the momentum/momentum2 is 0.9/0.99, and the weight decay is 0.0001. (A configuration sketch follows this table.)
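The Experiment Setup row above lists concrete hyperparameters but no code is available, so the following is only a minimal PyTorch-style sketch of how the target-domain fine-tuning settings might be wired up. The detector and source-network interfaces, the mask construction, and the exact form of the regularizers are illustrative assumptions (a standard distillation-style KL term is used for the transfer-knowledge regularization); only the numeric values (Adam lr 0.0002 with 0.1 decay, betas 0.9/0.99, weight decay 0.0001, temperature 2, and regularization weights 0.5) come from the paper.

```python
# Hedged sketch, not the authors' implementation (the paper used Caffe and
# released no code). Hyperparameter values are taken from the Experiment
# Setup row above; all module interfaces below are hypothetical placeholders.
import torch
import torch.nn.functional as F

T = 2.0          # temperature of the transfer-knowledge (TK) regularization
LAMBDA_BD = 0.5  # weight of the background-depression (BD) regularization
LAMBDA_TK = 0.5  # weight of the TK regularization


def build_optimizer(detector: torch.nn.Module) -> torch.optim.Adam:
    """Adam with the settings reported for both source and target training."""
    return torch.optim.Adam(
        detector.parameters(),
        lr=2e-4,            # initial learning rate 0.0002 (decayed by 0.1 in the paper)
        betas=(0.9, 0.99),  # momentum / momentum2
        weight_decay=1e-4,
    )


def bd_regularizer(conv5_3_feat: torch.Tensor, background_mask: torch.Tensor) -> torch.Tensor:
    """Background depression: penalize conv5_3 activations on background regions
    (the mask comes from ground-truth boxes; its construction is omitted here)."""
    return (conv5_3_feat * background_mask).norm(p=2)


def tk_regularizer(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    """Transfer knowledge: match temperature-softened class posteriors of the
    target detector to those of the source detector (distillation-style KL term,
    an assumption about the exact formulation)."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)


def training_step(detector, source_net, optimizer, images, targets):
    """One target-domain step: detection loss plus the two regularizers.
    The detector/source_net call signatures are hypothetical."""
    det_loss, logits, conv5_3_feat, bg_mask = detector(images, targets)
    with torch.no_grad():
        teacher_logits = source_net(images)
    loss = (det_loss
            + LAMBDA_BD * bd_regularizer(conv5_3_feat, bg_mask)
            + LAMBDA_TK * tk_regularizer(logits, teacher_logits))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss
```

The sketch only illustrates how the reported hyperparameters combine into a regularized fine-tuning step; reproducing the paper's results would additionally require the SSD-style bounding-box regression, Faster R-CNN-style classification head, and mini-batch composition (32 images per batch, 64 proposals for the (K+1)-way classifier) that it describes in prose.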