Detect or Track: Towards Cost-Effective Video Object Detection/Tracking

Authors: Hao Luo, Wenxuan Xie, Xinggang Wang, Wenjun Zeng
Pages: 8803-8810

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Although being light-weight and simple in structure, the scheduler network is more effective than the frame skipping baselines and flow-based approaches, as validated on ImageNet VID dataset in video object detection/tracking.
Researcher Affiliation | Collaboration | Hao Luo (1), Wenxuan Xie (2), Xinggang Wang (1), Wenjun Zeng (2). (1) School of Electronic Information and Communications, Huazhong University of Science and Technology; (2) Microsoft Research Asia. {luohao, xgwang}@hust.edu.cn, {wenxie, wezeng}@microsoft.com
Pseudocode | Yes | Algorithm 1: The Detect or Track (DorT) Framework
Open Source Code | No | The paper does not provide an unambiguous statement or link for the open-source code of the described methodology.
Open Datasets | Yes | All experiments are conducted on the ImageNet VID dataset (Russakovsky et al. 2015).
Dataset Splits | No | The paper mentions 'ImageNet VID is split into a training set of 3862 videos and a test set of 555 videos.' but does not provide specific details for a separate validation split. Although it refers to a 'validation set' when reporting results, the split details are not provided.
Hardware Specification | Yes | All experiments are conducted on a workstation with an Intel Core i7-4790k CPU and a Titan X GPU.
Software Dependencies | No | The paper mentions software components such as R-FCN, ResNet-101, SiamFC, AlexNet, and the SGD optimizer, but does not provide specific version numbers for these or other software dependencies required for reproduction.
Experiment Setup | Yes | The SGD optimizer is adopted with a learning rate 1e-2, momentum 0.9 and weight decay 5e-4. The batch size is set to 32. During testing, we raise the decision threshold of track to δ = 0.97.
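
The test-time threshold quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the strict `>` comparison, and the choice to always detect on the first frame are assumptions made for illustration. Only the value δ = 0.97 comes from the summary above.

```python
# Hypothetical sketch of a thresholded detect-or-track decision.
# Only TRACK_THRESHOLD (delta = 0.97) is taken from the summary;
# everything else is an illustrative assumption.

TRACK_THRESHOLD = 0.97  # decision threshold of "track" at test time


def choose_action(track_probability, threshold=TRACK_THRESHOLD):
    """Return "track" only when the scheduler is confident enough.

    track_probability is assumed to be the scheduler network's
    probability for the "track" class, in the range [0, 1].
    """
    if not 0.0 <= track_probability <= 1.0:
        raise ValueError("track_probability must lie in [0, 1]")
    # Assumption: strict comparison; the summary does not say whether
    # a probability exactly equal to the threshold counts as "track".
    return "track" if track_probability > threshold else "detect"


def schedule(track_probabilities, threshold=TRACK_THRESHOLD):
    """Map per-frame scheduler outputs to a list of actions.

    Assumption: the first frame of a video is always detected, and
    track_probabilities holds one value for every later frame.
    """
    actions = ["detect"]
    for probability in track_probabilities:
        actions.append(choose_action(probability, threshold))
    return actions


if __name__ == "__main__":
    print(schedule([0.99, 0.98, 0.60, 0.995]))
```

A threshold this high biases the scheduler towards the more expensive "detect" action, so tracking is used only when the scheduler is very confident that it will succeed.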