Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
VideoCapsuleNet: A Simplified Network for Action Detection
Authors: Kevin Duarte, Yogesh Rawat, Mubarak Shah
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed network achieves state-of-the-art performance on multiple action detection datasets including UCF-Sports, J-HMDB, and UCF-101 (24 classes) with an impressive 20% improvement on UCF-101 and 15% improvement on J-HMDB in terms of v-mAP scores. |
| Researcher Affiliation | Academia | Kevin Duarte EMAIL; Yogesh S. Rawat EMAIL; Mubarak Shah EMAIL. Center for Research in Computer Vision, University of Central Florida, Orlando, FL 32816 |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | We measure the performance of our network on three datasets: UCF-Sports [15], J-HMDB [16], and UCF-101 [17]. |
| Dataset Splits | Yes | The UCF-Sports dataset consists of 150 videos from 10 action classes. All videos contain spatio-temporal annotations in the form of frame-level bounding boxes and we follow the standard training/testing split used by [21]. |
| Hardware Specification | Yes | Although capsule networks tend to be computationally expensive (due to the routing-by-agreement), capsule-pooling allows VideoCapsuleNet to run on a single Titan X GPU using a batch size of 8. |
| Software Dependencies | No | We implement VideoCapsuleNet using TensorFlow [12]. |
| Experiment Setup | Yes | The network was trained using the Adam optimizer [14], with a learning rate of 0.0001. Due to the size of VideoCapsuleNet, a batch size of 8 was used during training. |
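The reported experiment setup (Adam optimizer, learning rate 0.0001, batch size 8) can be sketched in isolation. Since the paper releases no source code, the VideoCapsuleNet model itself is unavailable; the snippet below is a minimal, hedged illustration of the stated hyperparameters using the standard Adam update rule, with a single scalar parameter standing in for the network weights.

```python
# Minimal sketch of the training configuration reported in the paper:
# Adam optimizer with learning rate 1e-4 and batch size 8. The Adam
# update rule is the standard one; the scalar `param` is a stand-in,
# since the actual VideoCapsuleNet implementation is not public.

def adam_step(param, grad, m, v, t,
              lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction; returns (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    param = param - lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v

BATCH_SIZE = 8                            # as stated in the paper
param, m, v = 1.0, 0.0, 0.0
param, m, v = adam_step(param, grad=2.0, m=m, v=v, t=1)
# the first Adam step moves the parameter by roughly lr * sign(grad)
```

On the first step the bias-corrected update reduces to approximately `lr * sign(grad)`, so the small learning rate of 0.0001 dominates the step size regardless of gradient magnitude.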