Online Hybrid Lightweight Representations Learning: Its Application to Visual Tracking
Authors: Ilchae Jung, Minji Kim, Eunhyeok Park, Bohyung Han
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to the state-of-the-art real-time trackers based on conventional deep neural networks, our tracking algorithm demonstrates competitive accuracy on the standard benchmarks with a small fraction of computational cost and memory footprint. We evaluate the proposed algorithm on online visual tracking, which requires online model updates and highly accurate bounding box alignments in real-time streaming data. Our algorithm achieves competitive performance in terms of accuracy and efficiency and, in particular, illustrates competency in the resource-aware environment. |
| Researcher Affiliation | Academia | Ilchae Jung (1,3), Minji Kim (1,2), Eunhyeok Park (3), and Bohyung Han (1,2). 1: ASRI, Seoul National University; 2: ECE & IPAI, Seoul National University; 3: CSE, POSTECH. chey0313@postech.ac.kr, minji@snu.ac.kr, eh.park@postech.ac.kr, bhhan@snu.ac.kr |
| Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | To pretrain the hybrid trackers, we employ the same training datasets as the baseline networks [Jung et al., 2018; Li et al., 2019; Chen et al., 2021]. We adopt the pretrained model of each baseline tracker for initialization and fine-tune them using the proposed loss function. The learned hybrid models are tested on the standard visual tracking benchmarks, OTB2015 [Wu et al., 2015], UAV123 [Mueller et al., 2016], LaSOT [Fan et al., 2019], VOT2016 [Kristan et al., 2016], and VOT2018 [Kristan et al., 2018], for comparisons with recently proposed competitors. |
| Dataset Splits | No | The paper describes training datasets and evaluation metrics but does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or counts) or cite standard splits for all datasets. |
| Hardware Specification | Yes | Our algorithm is implemented in PyTorch with an Intel i7-6850K CPU and an NVIDIA Titan Xp (Pascal) GPU for RT-MDNet, and a Quadro RTX 4000 for SiamRPN++ and TransT. |
| Software Dependencies | No | The paper mentions 'implemented in PyTorch' but does not provide specific version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | We employ the same learning rates as the baselines but reduce the rate of the quantized network to 1/10 of the original value in Hybrid SiamRPN++. As in [Jung et al., 2019], we use the full precision in the first convolution layer in Hybrid SiamRPN++ and Hybrid TransT. For Hybrid RT-MDNet, we train the model with batch size 48 and run 120K iterations while optimizing model parameters and auxiliary parameters using SGD and Adam, respectively. For Hybrid SiamRPN++, training is performed on 8 GPUs for 100K iterations with batch size 22, and the same optimization methods are employed as Hybrid RT-MDNet. Hybrid TransT is trained for 400 epochs (1K iterations per epoch) using the AdamW optimizer with batch size 40, where the learning rate is decayed by a factor of 10 after 250 epochs. |
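
The Experiment Setup row describes a split optimization scheme (SGD for model parameters, Adam for auxiliary parameters) and, for Hybrid TransT, an AdamW schedule with a single decay step. The sketch below is a rough PyTorch illustration of that setup, not the authors' code: the network, the auxiliary parameter, the learning rates, and the loss are placeholder assumptions; only the optimizer choices, batch size 48, and the decay at epoch 250 of 400 come from the quoted text.

```python
# Illustrative sketch of the dual-optimizer setup described for the hybrid
# trackers. All module definitions, learning rates, and the loss are
# placeholder assumptions; only the optimizer types, the batch size, and the
# AdamW decay milestone follow the paper's description.
import torch
from torch import nn, optim

model = nn.Sequential(                       # placeholder for a hybrid tracker backbone
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
)
aux_params = [nn.Parameter(torch.ones(1))]   # placeholder auxiliary (quantization) parameters

# Two optimizers, as reported for Hybrid RT-MDNet: SGD for model parameters,
# Adam for auxiliary parameters. Learning rates here are arbitrary examples.
opt_model = optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
opt_aux = optim.Adam(aux_params, lr=1e-3)

for iteration in range(3):                   # stand-in for the reported 120K iterations
    x = torch.randn(48, 3, 107, 107)         # batch size 48, as reported for Hybrid RT-MDNet
    loss = model(x).mean() + aux_params[0].sum()   # dummy loss for demonstration only
    opt_model.zero_grad()
    opt_aux.zero_grad()
    loss.backward()
    opt_model.step()
    opt_aux.step()

# For Hybrid TransT, the paper reports AdamW with the learning rate decayed by
# a factor of 10 after 250 of 400 epochs; a MultiStepLR with one milestone
# reproduces that decay pattern (base learning rate here is an assumption).
opt_transt = optim.AdamW(model.parameters(), lr=1e-4)
scheduler = optim.lr_scheduler.MultiStepLR(opt_transt, milestones=[250], gamma=0.1)
```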