VideoCapsuleNet: A Simplified Network for Action Detection
Authors: Kevin Duarte, Yogesh Rawat, Mubarak Shah
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed network achieves state-of-the-art performance on multiple action detection datasets including UCF-Sports, J-HMDB, and UCF-101 (24 classes) with an impressive 20% improvement on UCF-101 and 15% improvement on J-HMDB in terms of v-mAP scores. |
| Researcher Affiliation | Academia | Kevin Duarte kevin_duarte@knights.ucf.edu Yogesh S Rawat yogesh@crcv.ucf.edu Mubarak Shah shah@crcv.ucf.edu Center for Research in Computer Vision University of Central Florida Orlando, FL 32816 |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | We measure the performance of our network on three datasets UCF-Sports [15], J-HMDB [16], UCF-101 [17]. |
| Dataset Splits | Yes | The UCF-Sports dataset consists of 150 videos from 10 action classes. All videos contain spatio-temporal annotations in the form of frame-level bounding boxes and we follow the standard training/testing split used by [21]. |
| Hardware Specification | Yes | Although capsule networks tend to be computationally expensive (due to the routing-by-agreement), capsule-pooling allows VideoCapsuleNet to run on a single Titan X GPU using a batch size of 8. |
| Software Dependencies | No | We implement VideoCapsuleNet using Tensorflow [12]. |
| Experiment Setup | Yes | The network was trained using the Adam optimizer [14], with a learning rate of 0.0001. Due to the size of the VideoCapsuleNet, a batch size of 8 was used during training. |
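
The experiment setup quoted above can be written down as a small configuration plus a single Adam update step. This is a minimal, dependency-free sketch and not the authors' code: only the learning rate (0.0001) and batch size (8) come from the paper. The names `TrainConfig` and `adam_step` are hypothetical, and the `beta1`, `beta2`, and `epsilon` values are assumed to be the usual Adam defaults, which the quoted text does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 1e-4  # stated in the paper
    batch_size: int = 8          # stated in the paper
    beta1: float = 0.9           # assumption: common Adam default
    beta2: float = 0.999         # assumption: common Adam default
    epsilon: float = 1e-8        # assumption: common Adam default


def adam_step(params, grads, m, v, t, cfg):
    """Apply one Adam update to lists of floats.

    t is the 1-based step count. Returns (new_params, new_m, new_v).
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = cfg.beta1 * m_i + (1.0 - cfg.beta1) * g
        v_i = cfg.beta2 * v_i + (1.0 - cfg.beta2) * g * g
        # Bias correction for the zero-initialised averages.
        m_hat = m_i / (1.0 - cfg.beta1 ** t)
        v_hat = v_i / (1.0 - cfg.beta2 ** t)
        step = cfg.learning_rate * m_hat / (math.sqrt(v_hat) + cfg.epsilon)
        new_params.append(p - step)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


cfg = TrainConfig()
params, m, v = adam_step([1.0], [0.5], [0.0], [0.0], t=1, cfg=cfg)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient's magnitude.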