Hierarchical Attentive Recurrent Tracking
Authors: Adam Kosiorek, Alex Bewley, Ingmar Posner
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluation of the proposed model is performed on two datasets: pedestrian tracking on the KTH activity recognition dataset and the more difficult KITTI object tracking dataset. Section 5 presents experiments on KTH and KITTI datasets with comparison to related attention-based trackers. |
| Researcher Affiliation | Academia | Adam R. Kosiorek, Department of Engineering Science, University of Oxford, adamk@robots.ox.ac.uk; Alex Bewley, Department of Engineering Science, University of Oxford, bewley@robots.ox.ac.uk; Ingmar Posner, Department of Engineering Science, University of Oxford, ingmar@robots.ox.ac.uk |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and results are available online: https://github.com/akosiorek/hart |
| Open Datasets | Yes | Evaluation of the proposed model is performed on two datasets: pedestrian tracking on the KTH activity recognition dataset and the more difficult KITTI object tracking dataset. |
| Dataset Splits | No | We split all sequences into 80/20 sequences for train and test sets, respectively. No explicit mention of a validation split percentage. |
| Hardware Specification | Yes | The donation from Nvidia of the Titan Xp GPU used in this work is also gratefully acknowledged. |
| Software Dependencies | No | The paper mentions the 'RMSProp optimiser [9]' and 'AlexNet [1]', but does not provide version numbers for the software libraries or frameworks used (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | We use their pre-trained feature extractor. We follow the authors and set the glimpse size (h, w) = (28, 28). We replicate the training procedure exactly, with the exception of using the RMSProp optimiser [9] with a learning rate of 3.33 × 10⁻⁵ and momentum set to 0.9. Our feature map has the size of 14 × 14 × 384 with an input glimpse of size (h, w) = (56, 56). We apply dropout with probability 0.25 at the end of V1. We used 100 hidden units in the RNN with orthogonal initialisation and Zoneout [21] with probability set to 0.05. The system was trained via curriculum learning [2], starting with sequences of length five and increasing the sequence length every 13 epochs, with epoch length decreasing as sequence length grows. We used the same optimisation settings, with the exception of the learning rate, which we set to 3.33 × 10⁻⁶. A hedged code sketch of this setup follows the table. |
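The optimiser, regularisation, and curriculum settings quoted in the Experiment Setup row translate directly into code. Below is a minimal, hedged PyTorch sketch of that configuration. The authors' released implementation is in TensorFlow at the repository linked above; names such as `HART_KITTI_CONFIG`, `curriculum_seq_len`, and the flattened-feature `LSTMCell` are illustrative assumptions, not the repository's API, and the recurrent core is a simplification of HART's full attention pipeline.

```python
# Hedged sketch of the KITTI training configuration quoted above.
# The authors' code is TensorFlow (https://github.com/akosiorek/hart);
# everything not stated in the table is an assumption.
import torch
import torch.nn as nn

# Hyperparameters quoted in the table.
HART_KITTI_CONFIG = {
    "glimpse_size": (56, 56),            # input glimpse (h, w)
    "feature_map_shape": (14, 14, 384),
    "dropout_p": 0.25,                   # applied at the end of V1
    "rnn_hidden_units": 100,
    "zoneout_p": 0.05,
    "learning_rate": 3.33e-6,            # 3.33e-5 was used for the KTH experiments
    "momentum": 0.9,
    "initial_seq_len": 5,                # curriculum learning start
    "epochs_per_curriculum_step": 13,
}

def curriculum_seq_len(epoch, start_len=5, step_epochs=13):
    """Sequence length grows by one every `step_epochs` epochs (assumed schedule)."""
    return start_len + epoch // step_epochs

def zoneout(h_prev, h_new, p=0.05, training=True):
    """Minimal zoneout: randomly keep previous hidden units during training."""
    if training:
        keep = torch.bernoulli(torch.full_like(h_prev, p))
        return keep * h_prev + (1.0 - keep) * h_new
    return p * h_prev + (1.0 - p) * h_new

# Recurrent core with orthogonal initialisation; the flattened feature input
# is a simplification of HART's appearance/attention streams.
rnn = nn.LSTMCell(input_size=14 * 14 * 384,
                  hidden_size=HART_KITTI_CONFIG["rnn_hidden_units"])
for name, param in rnn.named_parameters():
    if "weight" in name:
        nn.init.orthogonal_(param)

dropout = nn.Dropout(p=HART_KITTI_CONFIG["dropout_p"])

optimiser = torch.optim.RMSprop(
    rnn.parameters(),
    lr=HART_KITTI_CONFIG["learning_rate"],
    momentum=HART_KITTI_CONFIG["momentum"],
)
```

Under this reading of the setup, each training epoch would re-cut the sequences to `curriculum_seq_len(epoch)` frames before feeding them to the tracker, and `zoneout` would be applied to the hidden state returned by the `LSTMCell` at every time step.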