Temporal Adaptive RGBT Tracking with Modality Prompt
Authors: Hongyu Wang, Xiaotao Liu, Yifan Li, Meng Sun, Dian Yuan, Jing Liu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three popular RGBT tracking benchmarks show that our method achieves state-of-the-art performance, while running at real-time speed. To verify the effectiveness of the main components, we perform a detailed ablation study on the LasHeR dataset. |
| Researcher Affiliation | Academia | Hongyu Wang, Xiaotao Liu*, Yifan Li, Meng Sun, Dian Yuan, Jing Liu Guangzhou Institute of Technology, Xidian University, Guangzhou, China 22171214782@stu.xidian.edu.cn, xtliu@xidian.edu.cn, 18066899461@163.com, sunmeng2002@163.com, d1anskr@163.com, neouma@163.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | We choose the LasHeR (Li et al. 2021) dataset for fine-tuning our TATrack. Evaluation also uses RGBT234 (Li et al. 2019) and RGBT210 (Li et al. 2017). |
| Dataset Splits | No | The paper mentions training and testing datasets but does not explicitly describe a separate validation dataset split with specific percentages, sample counts, or defined methodology for splitting data for validation purposes. |
| Hardware Specification | Yes | The models are trained on 2 NVIDIA RTX 3090 GPUs and the inference speed is tested on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | TATrack is implemented in Python using PyTorch. The paper mentions software names but does not provide specific version numbers for Python or PyTorch, which are necessary for reproducible software dependencies. |
| Experiment Setup | Yes | Each GPU holds 32 image pairs, resulting in a global batch size of 64. The model fine-tuning takes 25 epochs, and each epoch contains 6 * 10^4 sample pairs. We train our model by AdamW optimizer (Loshchilov and Hutter 2017) with the weight decay 10^-4. The initial learning rate is set to 1 * 10^-4 and decreased by the factor of 10 after 10 epochs. The templates and search regions are resized to 128x128 and 256x256, respectively. |
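The reported optimization settings (AdamW, initial learning rate 1e-4, weight decay 1e-4, 25 fine-tuning epochs, learning rate decreased by a factor of 10 after epoch 10) can be sketched as a small schedule helper. This is a hypothetical reconstruction from the quoted setup, not code from the paper; the function name and structure are assumptions.

```python
# Hypothetical sketch of the fine-tuning schedule reported in the paper:
# AdamW, lr 1e-4, weight decay 1e-4, 25 epochs, lr / 10 after epoch 10.
BASE_LR = 1e-4
WEIGHT_DECAY = 1e-4
EPOCHS = 25
DECAY_EPOCH = 10
DECAY_FACTOR = 0.1

def lr_at_epoch(epoch: int) -> float:
    """Learning rate used during the given (0-indexed) epoch."""
    return BASE_LR * DECAY_FACTOR if epoch >= DECAY_EPOCH else BASE_LR

# The full 25-epoch schedule: 1e-4 for epochs 0-9, then 1e-5 thereafter.
schedule = [lr_at_epoch(e) for e in range(EPOCHS)]
```

In PyTorch this step schedule would typically be expressed with `torch.optim.AdamW` plus `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10], gamma=0.1)`.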