GlobalTrack: A Simple and Strong Baseline for Long-Term Tracking
Authors: Lianghua Huang, Xin Zhao, Kaiqi Huang (pp. 11037-11044)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To verify the effectiveness of our approach, we conduct evaluation on four large-scale tracking benchmarks: LaSOT (Fan et al. 2019), TrackingNet (Muller et al. 2018), TLP (Moudgil and Gandhi 2018) and OxUvA (Valmadre et al. 2018) |
| Researcher Affiliation | Academia | CRISE, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China. huanglianghua2017@ia.ac.cn, {xzhao, kqhuang}@nlpr.ia.ac.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We use a combination of COCO (Lin et al. 2014), GOT-10k (Huang, Zhao, and Huang 2018) and LaSOT (Fan et al. 2019) datasets for training our model |
| Dataset Splits | No | The paper mentions using training videos and test sets from public benchmarks but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) for its own experimental setup beyond those inherent to the benchmarks themselves. |
| Hardware Specification | Yes | The training process takes about 16 hours on four GTX Titan X GPUs, while the online tracking runs at around 6 fps on a single GPU. |
| Software Dependencies | No | The paper states 'Our approach is implemented in Python, using PyTorch' but does not specify version numbers for Python, PyTorch, or any other ancillary software components. |
| Experiment Setup | Yes | Parameters We use Faster-RCNN with ResNet-50 backbone (Girshick 2015) as our base model for constructing query-guided RCNN. The channel number of the backbone features is c = 256. We set the output channel number of fx, fz and hx, hz to c = 256 as well. [...] The localization loss weight in Eq. (2) and Eq. (5) is set to λ = 1. Optimization We use stochastic gradient descent with a batch size of 4 pairs to train our model. The momentum and weight decay are set to 0.9 and 1 × 10⁻⁴, respectively. [...] We train our model for 12 epochs on COCO and another 12 epochs on a combination of COCO, GOT-10k and LaSOT datasets, as described in the previous subsection. The initial learning rate is set to 0.01, and it decays with a factor of 0.1 at epochs 8 and 11. |
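
The optimization details quoted in the Experiment Setup row map directly onto standard PyTorch components. Below is a minimal, self-contained sketch of that schedule (SGD, batch size 4, momentum 0.9, weight decay 1 × 10⁻⁴, initial learning rate 0.01, decayed by a factor of 0.1 at epochs 8 and 11, 12 epochs per training stage). The toy convolutional model, dummy tensors, and dummy loss are hypothetical stand-ins, since the paper's query-guided R-CNN is not publicly released.

```python
# Sketch of the reported training schedule, assuming standard PyTorch APIs.
# The model and data below are placeholders, not the authors' network.
import torch
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Conv2d(3, 256, kernel_size=3, padding=1)  # stand-in for the query-guided R-CNN

# SGD with the hyperparameters reported in the paper.
optimizer = SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)

# Learning rate decays by 0.1 at epochs 8 and 11.
scheduler = MultiStepLR(optimizer, milestones=[8, 11], gamma=0.1)

for epoch in range(12):                 # 12 epochs per training stage
    for _ in range(10):                 # placeholder for batches of 4 (query, search) pairs
        x = torch.randn(4, 3, 64, 64)   # dummy inputs standing in for image pairs
        loss = model(x).mean()          # dummy loss; the paper uses classification + λ·localization, λ = 1
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
    print(f"epoch {epoch}: lr = {scheduler.get_last_lr()[0]:.4f}")
```

The same loop would be run twice under the reported setup: once on COCO alone and once on the combined COCO, GOT-10k and LaSOT data.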