Optical Flow in Deep Visual Tracking

Authors: Mikko Vihlman, Arto Visala (pp. 12112-12119)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The VOT toolkit (Kristan et al. 2015) is used to validate and test the trackers. ... Final tests are done once for each architecture using the actual VOT test videos. ... Table 2 shows the validation results for the architectures. ... Table 3 shows the results for the full architectures, Table 4 for the lighter architectures..."
Researcher Affiliation | Academia | "Mikko Vihlman, Arto Visala, Department of Electrical Engineering and Automation, Aalto University, PO Box 15500, FIN-00076 Aalto, Finland, {mikko.vihlman, arto.visala}@aalto.fi"
Pseudocode | No | The paper describes its methods and architectures in text and with diagrams (Figure 1 and Table 1) but provides no pseudocode or formal algorithm blocks.
Open Source Code | No | The paper states, "To make the trackers directly comparable to the GOTURN tracker (Held, Thrun, and Savarese 2016), the implementation builds on the original code of GOTURN," but it gives no statement or link indicating that the authors' modifications or their full implementation are publicly available.
Open Datasets | Yes | "In this paper, training and validation is done using the ILSVRC (Russakovsky et al. 2015) dataset for object detection in video (ImageNet Video) in order to evaluate the trackers using VOT. ... The original GOTURN tracker (Held, Thrun, and Savarese 2016) was trained using about 300 videos from ALOV++ (Smeulders et al. 2014)..."
Dataset Splits | Yes | "There are 3862 videos for training and 555 for validation. ... Validation is done with the VOT toolkit using sequences generated from the validation videos of ImageNet Video to choose the best settings for each architecture."
Hardware Specification | No | The paper states, "The inputs of the networks are prepared using CPU, other computations are done on GPU," but it does not name specific CPU or GPU models, memory sizes, or any other hardware details used for the experiments.
Software Dependencies | No | The paper states, "The proposed trackers are implemented using the Caffe deep learning framework (Jia et al. 2014)," but it gives no version number for Caffe or for any other software dependency.
Experiment Setup | Yes | "Training is done for 450,000 iterations using batches of 50 examples. ... The base learning rate of the parameters applied to the flow input is set to 0, 1e-5 or 1e-6 (fix or finetune). Fully-connected layers are always updated using a base learning rate of 1e-5. The convolutional layers connected to the image inputs have a learning rate of 0; that is, their parameters are fixed. Learning rates of all layers are divided by 10 after every 100,000 iterations."
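The per-layer base rates and step-wise decay quoted above correspond to Caffe's "step" learning-rate policy. The following is a minimal sketch of that schedule, assuming the standard `base_lr * gamma^(iter // stepsize)` rule; the layer-group names are illustrative, not taken from the paper.

```python
# Sketch of the reported schedule: each layer group has its own base rate,
# and every rate is divided by 10 after each 100,000 iterations
# (Caffe "step" policy with gamma = 0.1). Group names are hypothetical.

BASE_LR = {
    "image_conv": 0.0,   # conv layers on image inputs: frozen
    "flow_conv": 1e-5,   # conv layers on flow input: 0, 1e-5 or 1e-6 (fix/finetune)
    "fc": 1e-5,          # fully-connected layers: always updated
}

def lr_at(iteration, base_lr, step_size=100_000, gamma=0.1):
    """Learning rate at a given iteration under the step-decay policy."""
    return base_lr * gamma ** (iteration // step_size)

# Training runs for 450,000 iterations, so each base rate is decayed 4 times.
for it in (0, 100_000, 449_999):
    rates = {name: lr_at(it, lr) for name, lr in BASE_LR.items()}
    print(it, rates)
```

Under this reading, a fully-connected layer's rate falls from 1e-5 to 1e-9 by the final iteration, while the frozen image-input conv layers stay at 0 throughout.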