SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

Authors: Jinlong Peng, Zhengkai Jiang, Yueyang Gu, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

IJCAI 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimental results demonstrate the effectiveness of SiamRCR and its superiority over the state-of-the-art competitors on GOT-10k, LaSOT, TrackingNet, OTB-2015, VOT-2018 and VOT-2019. Moreover, our SiamRCR runs at 65 FPS, far above the real-time requirement.
Researcher Affiliation Collaboration Tencent Youtu Lab, Kyoto University, Shanghai Jiao Tong University
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks. It describes the method using figures and prose.
Open Source Code No The paper states 'Our algorithm is implemented by Python 3.6 and PyTorch 1.1.0.' but does not provide a repository link, an explicit code-release statement, or any indication that code for the described methodology is included in supplementary materials.
Open Datasets Yes The whole network is optimized by Stochastic Gradient Descent (SGD) with momentum 0.9 on the datasets of GOT-10k [Huang et al., 2019], TrackingNet [Müller et al., 2018], COCO [Lin et al., 2014], LaSOT [Fan et al., 2019], ImageNet VID [Russakovsky et al., 2015] and ImageNet DET [Russakovsky et al., 2015].
Dataset Splits No The paper mentions training and test subsets for datasets (e.g., 'We train SiamRCR only on the train subset which consists of about 10,000 sequences and test it on the test subset of 180 sequences.'), but does not explicitly provide specific details for a separate validation dataset split.
Hardware Specification Yes The experiments are conducted on a server with Intel(R) Xeon(R) CPU E5-2680 v4 2.40GHz, and a NVIDIA Tesla P40 24GB GPU with CUDA 10.1.
Software Dependencies Yes Our algorithm is implemented by Python 3.6 and PyTorch 1.1.0. The experiments are conducted on a server with Intel(R) Xeon(R) CPU E5-2680 v4 2.40GHz, and a NVIDIA Tesla P40 24GB GPU with CUDA 10.1.
Experiment Setup Yes We totally train the network for 20 epochs. The batch size is 128. The learning rate is from 0.000001 to 0.1 in the first 5 epochs for warm-up and from 0.1 to 0.0001 with cosine schedule in the last 15 epochs. We freeze the backbone in the first 10 epochs and fine-tune it in the other 10 epochs with a reduced learning rate (multiplying 0.1). The sizes of the exemplar image and search image are 127×127 and 255×255, respectively.
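The reported schedule (linear warm-up from 1e-6 to 0.1 over 5 epochs, then cosine decay from 0.1 to 1e-4 over 15 epochs, with the backbone frozen for 10 epochs and then fine-tuned at 0.1× the base learning rate) can be sketched as below. This is a minimal reconstruction from the quoted hyperparameters only; the function names, per-epoch granularity, and exact warm-up/decay interpolation are assumptions, not details given in the paper.

```python
import math

# Hyperparameters quoted from the paper's experiment setup.
WARMUP_EPOCHS = 5
TOTAL_EPOCHS = 20
FREEZE_EPOCHS = 10          # backbone frozen during these epochs
LR_START, LR_PEAK, LR_END = 1e-6, 0.1, 1e-4
BACKBONE_LR_SCALE = 0.1     # "reduced learning rate (multiplying 0.1)"

def lr_at_epoch(epoch: int) -> float:
    """Base learning rate for a given epoch (0-indexed); schedule shape is assumed."""
    if epoch < WARMUP_EPOCHS:
        # Linear warm-up from 1e-6 to 0.1 over the first 5 epochs.
        t = epoch / WARMUP_EPOCHS
        return LR_START + t * (LR_PEAK - LR_START)
    # Cosine decay from 0.1 toward 1e-4 over the remaining 15 epochs.
    t = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return LR_END + 0.5 * (LR_PEAK - LR_END) * (1 + math.cos(math.pi * t))

def backbone_lr_at_epoch(epoch: int) -> float:
    """Backbone gets no updates while frozen, then 0.1x the base rate."""
    if epoch < FREEZE_EPOCHS:
        return 0.0
    return BACKBONE_LR_SCALE * lr_at_epoch(epoch)
```

In a PyTorch training loop this would typically be wired up with two optimizer parameter groups (head vs. backbone), updating each group's `lr` from these functions at the start of every epoch.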