GlobalTrack: A Simple and Strong Baseline for Long-Term Tracking

Authors: Lianghua Huang, Xin Zhao, Kaiqi Huang

AAAI 2020, pp. 11037-11044

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To verify the effectiveness of our approach, we conduct evaluation on four large-scale tracking benchmarks: LaSOT (Fan et al. 2019), TrackingNet (Muller et al. 2018), TLP (Moudgil and Gandhi 2018) and OxUvA (Valmadre et al. 2018).
Researcher Affiliation | Academia | 1CRISE, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2University of Chinese Academy of Sciences, Beijing, China; 3CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China. huanglianghua2017@ia.ac.cn, {xzhao, kqhuang}@nlpr.ia.ac.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We use a combination of COCO (Lin et al. 2014), GOT-10k (Huang, Zhao, and Huang 2018) and LaSOT (Fan et al. 2019) datasets for training our model.
Dataset Splits | No | The paper mentions training videos and test sets from public benchmarks, but it does not specify explicit training/validation/test splits (e.g., percentages or sample counts) for its own experimental setup beyond those inherent to the benchmarks themselves.
Hardware Specification | Yes | The training process takes about 16 hours on four GTX Titan X GPUs, while the online tracking runs at around 6 fps on a single GPU.
Software Dependencies | No | The paper states 'Our approach is implemented in Python, using PyTorch' but does not specify version numbers for Python, PyTorch, or any other ancillary software components.
Experiment Setup | Yes | Parameters: We use Faster-RCNN with ResNet-50 backbone (Girshick 2015) as our base model for constructing query-guided RCNN. The channel number of the backbone features is c = 256. We set the output channel number of fx, fz and hx, hz to c = 256 as well. [...] The localization loss weight in Eq. (2) and Eq. (5) is set to λ = 1. Optimization: We use stochastic gradient descent with a batch size of 4 pairs to train our model. The momentum and weight decay are set to 0.9 and 1 × 10⁻⁴, respectively. [...] We train our model for 12 epochs on COCO and another 12 epochs on a combination of COCO, GOT-10k and LaSOT datasets, as described in the previous subsection. The initial learning rate is set to 0.01, and it decays by a factor of 0.1 at epochs 8 and 11.
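
For concreteness, the optimization settings quoted above translate directly into a short PyTorch training skeleton. The sketch below is illustrative, not the authors' code: it substitutes torchvision's stock Faster R-CNN (ResNet-50 FPN) for the paper's query-guided RCNN, and the `train_stage` helper and its `loader` argument are hypothetical; only the optimizer, schedule, batch size, and epoch counts come from the quote.

```python
# Minimal sketch of the quoted optimization setup, assuming the reported
# hyperparameters map onto standard PyTorch SGD + MultiStepLR. torchvision's
# stock Faster R-CNN stands in for the paper's query-guided RCNN (the
# query-guided modulation layers are omitted here).
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,            # initial learning rate (paper)
    momentum=0.9,       # momentum (paper)
    weight_decay=1e-4,  # weight decay (paper)
)
# Decay the learning rate by a factor of 0.1 at epochs 8 and 11
# of a 12-epoch training stage.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[8, 11], gamma=0.1
)

def train_stage(loader, epochs=12):
    """One 12-epoch stage; the paper runs two such stages
    (COCO only, then COCO + GOT-10k + LaSOT)."""
    model.train()
    for _ in range(epochs):
        for images, targets in loader:   # batches of 4 pairs in the paper
            losses = model(images, targets)  # dict of loss terms
            loss = sum(losses.values())      # classification + λ·localization, λ = 1
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
```

A second call to `train_stage` with a loader over the combined COCO + GOT-10k + LaSOT data (and a freshly initialized scheduler) would mirror the two-stage schedule described in the quote; the paper does not state whether the learning-rate schedule is reset between stages.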