Attribute-Based Progressive Fusion Network for RGBT Tracking

Authors: Yun Xiao, MengMeng Yang, Chenglong Li, Lei Liu, Jin Tang | pp. 2831-2838

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets demonstrate the effectiveness of our APFNet against other state-of-the-art methods.
Researcher Affiliation | Academia | 1Information Materials and Intelligent Sensing Laboratory of Anhui Province, Hefei, China 2Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Hefei, China 3School of Computer Science and Technology, Anhui University, Hefei, China 4School of Artificial Intelligence, Anhui University, Hefei, China xiaoyun@ahu.edu.cn, xiuxiaoran@163.com, lcl1314@foxmail.com, liulei970507@163.com, tangjin@ahu.edu.cn
Pseudocode | No | The paper describes the algorithms in text and diagrams, but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code will be available at https://github.com/mmiclcl/source-code.
Open Datasets | Yes | In most of existing datasets such as GTOT (Li et al. 2016) and RGBT234 (Li et al. 2019a), each attribute is manually annotated for each video frame. It supports us to train each attribute-specific fusion branch individually. ...GTOT dataset includes 50 RGBT video pairs... RGBT234 is a larger RGBT tracking dataset... LasHeR is the largest RGBT tracking dataset at present, which contains 1224 aligned video sequences...
Dataset Splits | Yes | For the testing on GTOT dataset, we train our attribute-specific fusion branches with corresponding challenge-based training data extracted from RGBT234 dataset by challenge labels. Then, we use the entire dataset of RGBT234 to train the attribute-based aggregation SKNet and the enhancement fusion transformer module. While for the testing on RGBT234 and LasHeR datasets, we exchange training and test sets; in other words, our training dataset is GTOT, and the training process is similar to the one mentioned above. ...LasHeR is the largest RGBT tracking dataset at present, which contains 1224 aligned video sequences including more diverse attribute annotations, in which 245 sequences are set aside as the testing set, and the remaining are used for training.
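The cross-dataset protocol quoted above can be summarized as a simple lookup keyed by the benchmark under test. The mapping below is an illustrative sketch built from the paper's description, not code from the authors' release; the key names are labels chosen here.

```python
# Cross-dataset training/testing protocol as described in the paper.
# Keys are the benchmarks used for testing; values describe the training data.
# This mapping is an illustrative sketch, not code from the authors' release.
split_protocol = {
    "GTOT": {
        "attribute_branch_train": "RGBT234 (challenge-labeled subsets)",
        "aggregation_train": "RGBT234 (entire dataset)",
    },
    "RGBT234": {
        "attribute_branch_train": "GTOT (challenge-labeled subsets)",
        "aggregation_train": "GTOT (entire dataset)",
    },
    "LasHeR": {
        "attribute_branch_train": "GTOT (challenge-labeled subsets)",
        "aggregation_train": "GTOT (entire dataset)",
        "note": "LasHeR itself splits 245 of its 1224 sequences off for testing",
    },
}

def training_data_for(test_set: str) -> dict:
    """Return the training-data description for a given test benchmark."""
    return split_protocol[test_set]
```

The point the table row makes is the swap: GTOT testing uses RGBT234 for training, while RGBT234 and LasHeR testing both fall back to GTOT as the training set.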
Hardware Specification | Yes | Our tracker is implemented in PyTorch 1.0.1, Python 3.7, CUDA 10.2 and runs on a computer with 8 NVIDIA GeForce RTX 1080Ti GPU cards.
Software Dependencies | Yes | Our tracker is implemented in PyTorch 1.0.1, Python 3.7, CUDA 10.2 and runs on a computer with 8 NVIDIA GeForce RTX 1080Ti GPU cards.
Experiment Setup | Yes | The learning rates are set to 0.001 and 0.0005 for attribute-specific fusion branches and FC6, respectively. Since the data under illumination variation is very small, the learning rate under this attribute-specific fusion branch is 0.002. The stochastic gradient descent (SGD) method is adopted as the optimization strategy with momentum of 0.9, and the weight attenuation is set to 0.0005. The number of training periods is 200.
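The hyperparameters in this row can be collected into a single plain-Python config, shown below as a minimal sketch. The component labels ("attribute_fusion_branch", "fc6", etc.) are descriptive names chosen here, not identifiers from the released code.

```python
# Reported training hyperparameters, gathered into one plain-Python config.
# Component labels are descriptive names chosen for this sketch, not
# parameter-group names from the authors' code.
train_config = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 0.0005,  # the paper's "weight attenuation"
    "epochs": 200,           # the paper's "number of training periods"
    "lr": {
        "attribute_fusion_branch": 0.001,
        # larger rate because illumination-variation training data is scarce
        "illumination_variation_branch": 0.002,
        "fc6": 0.0005,
    },
}

def lr_for(component: str) -> float:
    """Look up the learning rate for a named component."""
    return train_config["lr"][component]
```

In PyTorch this corresponds to passing per-parameter-group `lr` entries to `torch.optim.SGD(..., momentum=0.9, weight_decay=0.0005)`.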