Discriminative and Robust Online Learning for Siamese Visual Tracking
Authors: Jinghao Zhou, Peng Wang, Haoyang Sun (pp. 13017–13024)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments); 4.2 Comparison with state-of-the-art; 4.3 Ablation Study; Table 1: state-of-the-art comparison on two popular tracking benchmarks, OTB2015 and VOT2018, with their running speed. |
| Researcher Affiliation | Academia | Jinghao Zhou, Peng Wang, Haoyang Sun; School of Computer Science and School of Automation, Northwestern Polytechnical University, China; National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, China; jensen.zhoujh@gmail.com, peng.wang@nwpu.edu.cn, sunhaoyang@mail.nwpu.edu.cn |
| Pseudocode | Yes | Algorithm 1: Tracking algorithm |
| Open Source Code | Yes | Our method is implemented in Python with PyTorch, and the complete code and video demo will be made available at https://github.com/shallowtoil/DROL. |
| Open Datasets | Yes | OTB100, VOT2018, VOT2018-LT, UAV123, TrackingNet, and LaSOT; OTB2015 (Wu, Lim, and Yang 2015); VOT2018 (Kristan et al. 2018) |
| Dataset Splits | Yes | The above hyper-parameters are set using VOT2018 as the validation set and are further evaluated in Section 5. |
| Hardware Specification | Yes | The speed is tested on Nvidia GTX 1080Ti GPU. |
| Software Dependencies | No | Our method is implemented in Python with PyTorch, and the complete code and video demo will be made available at https://github.com/shallowtoil/DROL. The paper mentions Python and PyTorch but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For the classification subnet, the first layer is a 1×1 convolutional layer with ReLU activation, which reduces the feature dimensionality to 64. The last layer employs a 4×4 kernel with a single output channel. (...) For online tuning, we use the region of size 255×255 of the first frame to pre-train the whole classifier. (...) The classifier is updated every 10 frames with a learning rate set to 0.01 and doubled once neighboring distractors are detected. To fuse classification scores, we set λ to 0.6 in DROL-FC and 0.8 in DROL-RPN and DROL-Mask. (...) we update the short-term template every T = 5 frames, while τc, υr, and υc are set to 0.75, 0.6, and 0.5 respectively. |
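The classification subnet described in the Experiment Setup row (a 1×1 convolution with ReLU that reduces feature dimensionality to 64, followed by a 4×4 convolution producing a single-channel score map) can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' released code: the input channel count (256) and the feature-map size used in the example are assumptions, as the paper excerpt does not state them.

```python
import torch
import torch.nn as nn


class ClassifierHead(nn.Module):
    """Minimal sketch of the online classification subnet described in the paper.

    First layer: 1x1 conv + ReLU, reducing feature dimensionality to 64.
    Last layer: 4x4 conv with a single output channel (the score map).
    in_channels=256 is an assumed value for the backbone feature depth.
    """

    def __init__(self, in_channels=256):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, 64, kernel_size=1)  # 1x1 dim reduction
        self.relu = nn.ReLU(inplace=True)
        self.score = nn.Conv2d(64, 1, kernel_size=4)  # 4x4 kernel, single channel


    def forward(self, x):
        return self.score(self.relu(self.reduce(x)))


def fuse_scores(offline_score, online_score, lam=0.8):
    """Convex combination of offline similarity and online classification maps.

    lam corresponds to the paper's fusion weight (0.6 for DROL-FC, 0.8 for
    DROL-RPN and DROL-Mask); the exact weighting direction is an assumption.
    """
    return (1.0 - lam) * offline_score + lam * online_score
```

With a hypothetical 22×22 backbone feature map, the 4×4 valid convolution yields a 19×19 single-channel score map, which would then be fused with the Siamese similarity score via `fuse_scores`.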