VideoCapsuleNet: A Simplified Network for Action Detection

Authors: Kevin Duarte, Yogesh Rawat, Mubarak Shah

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed network achieves state-of-the-art performance on multiple action detection datasets, including UCF-Sports, J-HMDB, and UCF-101 (24 classes), with an impressive 20% improvement on UCF-101 and 15% improvement on J-HMDB in terms of v-mAP scores. |
| Researcher Affiliation | Academia | Kevin Duarte (kevin_duarte@knights.ucf.edu), Yogesh S. Rawat (yogesh@crcv.ucf.edu), Mubarak Shah (shah@crcv.ucf.edu) — Center for Research in Computer Vision, University of Central Florida, Orlando, FL 32816. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code (a specific repository link, an explicit code-release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | "We measure the performance of our network on three datasets: UCF-Sports [15], J-HMDB [16], UCF-101 [17]." |
| Dataset Splits | Yes | "The UCF-Sports dataset consists of 150 videos from 10 action classes. All videos contain spatio-temporal annotations in the form of frame-level bounding boxes and we follow the standard training/testing split used by [21]." |
| Hardware Specification | Yes | "Although capsule networks tend to be computationally expensive (due to the routing-by-agreement), capsule-pooling allows VideoCapsuleNet to run on a single Titan X GPU using a batch size of 8." |
| Software Dependencies | No | "We implement VideoCapsuleNet using TensorFlow [12]." No version numbers for the dependencies are given. |
| Experiment Setup | Yes | "The network was trained using the Adam optimizer [14], with a learning rate of 0.0001. Due to the size of VideoCapsuleNet, a batch size of 8 was used during training." |
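The experiment-setup row reports the Adam optimizer with a learning rate of 0.0001. Since the authors' TensorFlow training code is not released, the sketch below only illustrates the standard Adam update rule (Kingma & Ba) for a single scalar parameter, plugging in the paper's reported learning rate; all function and variable names are illustrative, not taken from the paper.

```python
# Minimal single-parameter Adam update, using the learning rate the paper
# reports (0.0001). Illustrative sketch only -- not the authors' code.

def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for scalar parameter theta given gradient grad at step t."""
    m = beta1 * m + (1 - beta1) * grad        # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v

# Toy loss f(theta) = theta**2, so grad = 2 * theta; run three steps.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adam_step(theta, grad=2.0 * theta, m=m, v=v, t=t)
```

With the small learning rate of 1e-4, each step moves the parameter by roughly the learning rate, so after three steps theta has decreased only slightly from 1.0.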