Online Hybrid Lightweight Representations Learning: Its Application to Visual Tracking

Authors: Ilchae Jung, Minji Kim, Eunhyeok Park, Bohyung Han

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Compared to the state-of-the-art real-time trackers based on conventional deep neural networks, our tracking algorithm demonstrates competitive accuracy on the standard benchmarks with a small fraction of computational cost and memory footprint. We evaluate the proposed algorithm on online visual tracking, which requires online model updates and highly accurate bounding box alignments in real-time streaming data. Our algorithm achieves competitive performance in terms of accuracy and efficiency and, in particular, illustrates competency in the resource-aware environment.
Researcher Affiliation | Academia | Ilchae Jung1,3, Minji Kim1,2, Eunhyeok Park3 and Bohyung Han1,2. 1ASRI, Seoul National University; 2ECE & IPAI, Seoul National University; 3CSE, POSTECH. chey0313@postech.ac.kr, minji@snu.ac.kr, eh.park@postech.ac.kr, bhhan@snu.ac.kr
Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | To pretrain the hybrid trackers, we employ the same training datasets as the baseline networks [Jung et al., 2018; Li et al., 2019; Chen et al., 2021]. We adopt the pretrained model of each baseline tracker for initialization and fine-tune them using the proposed loss function. The learned hybrid models are tested on the standard visual tracking benchmarks, OTB2015 [Wu et al., 2015], UAV123 [Mueller et al., 2016], LaSOT [Fan et al., 2019], VOT2016 [Kristan et al., 2016], and VOT2018 [Kristan et al., 2018], for comparisons with recently proposed competitors.
Dataset Splits | No | The paper describes training datasets and evaluation metrics but does not explicitly provide specific training/validation/test splits (e.g., percentages or counts) or cite standard splits for all datasets.
Hardware Specification | Yes | Our algorithm is implemented in PyTorch with an Intel i7-6850K and an NVIDIA Titan Xp (Pascal) GPU for RT-MDNet, and a Quadro RTX 4000 for SiamRPN++ and TransT.
Software Dependencies | No | The paper mentions that the method is "implemented in PyTorch" but does not provide specific version numbers for PyTorch or other software dependencies.
Experiment Setup | Yes | We employ the same learning rates as the baselines but reduce the rate of the quantized network to 1/10 of the original value in Hybrid SiamRPN++. As in [Jung et al., 2019], we use full precision in the first convolution layer of Hybrid SiamRPN++ and Hybrid TransT. For Hybrid RT-MDNet, we train the model with batch size 48 for 120K iterations, optimizing the model parameters and auxiliary parameters using SGD and Adam, respectively. For Hybrid SiamRPN++, training is performed on 8 GPUs for 100K iterations with batch size 22, using the same optimization methods as Hybrid RT-MDNet. Hybrid TransT is trained for 400 epochs (1K iterations per epoch) using the AdamW optimizer with batch size 40, where the learning rate is decayed by a factor of 10 after 250 epochs.
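Since the paper releases no code, the sketch below is only a minimal PyTorch illustration of the optimizer setup this row describes: separate SGD and Adam optimizers for the model and auxiliary (quantization) parameters, the 1/10 learning rate for the quantized branch of Hybrid SiamRPN++, and the AdamW schedule with a 10x decay for Hybrid TransT. DummyHybridTracker, the aux_step parameter name, BASE_LR, and the momentum value are assumptions for illustration, not details from the paper.

import torch
import torch.nn as nn

# Hypothetical stand-in for a hybrid tracker: any nn.Module whose auxiliary
# (quantization) parameters can be told apart by name works the same way.
class DummyHybridTracker(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        # Assumed auxiliary quantization parameter (e.g., a learned step size).
        self.aux_step = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.backbone(x) * self.aux_step

BASE_LR = 1e-4  # placeholder; the paper reuses each baseline's learning rate
model = DummyHybridTracker()

# Split the parameters: SGD for the model weights, Adam for the auxiliary
# parameters, as stated for Hybrid RT-MDNet (and reused by Hybrid SiamRPN++).
model_params = [p for n, p in model.named_parameters() if not n.startswith("aux")]
aux_params = [p for n, p in model.named_parameters() if n.startswith("aux")]
opt_model = torch.optim.SGD(model_params, lr=BASE_LR, momentum=0.9)  # momentum assumed
opt_aux = torch.optim.Adam(aux_params, lr=BASE_LR)

for _ in range(3):  # the paper runs 120K iterations; 3 here just to execute
    x = torch.randn(48, 16)  # batch size 48, as reported for Hybrid RT-MDNet
    loss = model(x).pow(2).mean()  # dummy loss standing in for the tracking loss
    opt_model.zero_grad()
    opt_aux.zero_grad()
    loss.backward()
    opt_model.step()
    opt_aux.step()

# Hybrid SiamRPN++ additionally runs the quantized branch at 1/10 of the base
# learning rate; PyTorch parameter groups express that directly, e.g.
# torch.optim.SGD([{"params": fp_params, "lr": BASE_LR},
#                  {"params": quant_params, "lr": BASE_LR / 10}], momentum=0.9)

# Hybrid TransT instead pairs AdamW with a 10x decay after 250 of 400 epochs
# (scheduler.step() would be called once per epoch):
opt_transt = torch.optim.AdamW(model.parameters(), lr=BASE_LR)
scheduler = torch.optim.lr_scheduler.StepLR(opt_transt, step_size=250, gamma=0.1)

Two separate optimizers, rather than one optimizer with parameter groups, mirror the paper's wording that model and auxiliary parameters are optimized with SGD and Adam "respectively"; either construction yields the same updates here.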