Knowledge Integration Networks for Action Recognition

Authors: Shiwen Zhang, Sheng Guo, Limin Wang, Weilin Huang, Matthew Scott (pp. 12862-12869)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed KINet achieves state-of-the-art performance on the large-scale action recognition benchmark Kinetics-400, with a top-1 accuracy of 77.8%. We further demonstrate the strong transfer capability of our KINet by transferring the Kinetics-trained model to UCF-101, where it obtains 97.8% top-1 accuracy.
Researcher Affiliation | Collaboration | (1) Malong Technologies, Shenzhen, China; (2) Shenzhen Malong Artificial Intelligence Research Center, Shenzhen, China; (3) State Key Lab for Novel Software Technology, Nanjing University, China
Pseudocode | No | The paper describes its algorithms and modules in text and diagrams, but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets | Yes | To verify the effectiveness of our KINet, we conduct experiments on a large-scale action recognition dataset Kinetics-400 (Carreira and Zisserman 2017), which contains 400 action categories, with about 240k videos for training and 20k videos for validation. We then examine the generalization ability of our KINet by transferring the learned representation to a small dataset UCF-101 (Soomro, Zamir, and Shah 2012), containing 101 action categories with 13,320 videos in total.
Dataset Splits | Yes | To verify the effectiveness of our KINet, we conduct experiments on a large-scale action recognition dataset Kinetics-400 (Carreira and Zisserman 2017), which contains 400 action categories, with about 240k videos for training and 20k videos for validation. [...] For UCF-101, we follow (Wang et al. 2016a) to fine-tune the pretrained weights on Kinetics, where all but the first batch normalization layer are frozen and the model is trained for 80 epochs. Inference: for a fair comparison, we also follow (Wang et al. 2016a) by uniformly sampling 25 segments from each video and selecting one frame from each segment. (Minimal sketches of this sampling and of the partial-BN fine-tuning recipe follow the table.)
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, memory).
Software Dependencies | No | The paper mentions various models and datasets but does not list specific software dependencies with version numbers (e.g., Python version, library versions).
Experiment Setup | Yes | We utilize the SGD optimizer with the initial learning rate set to 0.01, which drops by a factor of 10 at epochs 20, 40 and 60. The model is trained for 70 epochs in total. We set the weight decay to 10^-5 and the momentum to 0.9. (A sketch of this schedule follows the table.)
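
The inference protocol quoted under Dataset Splits (25 uniformly spaced segments per video, one frame per segment) follows the TSN-style sampling of Wang et al. (2016a). Below is a minimal sketch of that sampling; the paper releases no code, so the function name sample_frame_indices and the choice of the center frame within each segment are assumptions.

```python
import numpy as np

def sample_frame_indices(num_frames: int, num_segments: int = 25) -> np.ndarray:
    """Uniformly split a video into `num_segments` segments and pick
    one frame (here, the center frame) from each, TSN-style."""
    segment_len = num_frames / num_segments
    # Center of each segment, clipped to valid frame indices.
    indices = np.floor((np.arange(num_segments) + 0.5) * segment_len)
    return np.clip(indices, 0, num_frames - 1).astype(int)

# Example: a 300-frame video yields 25 evenly spaced indices: 6, 18, 30, ..., 294
print(sample_frame_indices(300))
```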
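The UCF-101 fine-tuning recipe ("all but the first batch normalization layer frozen") is the partial-BN strategy from the same TSN line of work. A minimal sketch, assuming a PyTorch model (the paper does not name its framework): freeze_bn_except_first is a hypothetical helper, and in practice it would be re-applied after every call to model.train(), since train() resets BN layers to training mode.

```python
import torch.nn as nn

def freeze_bn_except_first(model: nn.Module) -> None:
    """Freeze every BatchNorm layer except the first one (partial BN)."""
    first_bn_seen = False
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            if not first_bn_seen:
                first_bn_seen = True  # keep the first BN trainable
                continue
            module.eval()  # freeze running mean/variance updates
            for p in module.parameters():
                p.requires_grad = False  # freeze affine scale/shift

# Usage on a toy backbone: only the first BatchNorm2d stays trainable.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3), nn.BatchNorm2d(64), nn.ReLU(),
    nn.Conv2d(64, 64, 3), nn.BatchNorm2d(64),
)
freeze_bn_except_first(model)
```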
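The Experiment Setup row maps directly onto a standard SGD configuration with a step learning-rate schedule. The sketch below again assumes PyTorch; the placeholder model and the empty training loop are illustrative only.

```python
import torch

# Placeholder standing in for KINet (hypothetical; no code is released).
model = torch.nn.Linear(2048, 400)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,           # initial learning rate from the paper
    momentum=0.9,      # momentum from the paper
    weight_decay=1e-5, # weight decay (10^-5) from the paper
)

# LR drops by a factor of 10 at epochs 20, 40 and 60.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[20, 40, 60], gamma=0.1
)

for epoch in range(70):  # 70 epochs in total
    # ... one training epoch over Kinetics-400 would go here ...
    scheduler.step()
```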