Spatiotemporal Residual Networks for Video Action Recognition

Authors: Christoph Feichtenhofer, Axel Pinz, Richard Wildes

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our novel spatiotemporal ResNet using two widely used action recognition benchmarks where it exceeds the previous state-of-the-art.
Researcher Affiliation | Academia | Christoph Feichtenhofer, Graz University of Technology (feichtenhofer@tugraz.at); Axel Pinz, Graz University of Technology (axel.pinz@tugraz.at); Richard P. Wildes, York University, Toronto (wildes@cse.yorku.ca)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our method has been implemented in MatConvNet [28] and we share our code and models at https://github.com/feichtenhofer/st-resnet.
Open Datasets | Yes | We evaluate our approach on two challenging action recognition datasets. First, we consider UCF101 [22], which consists of 13320 videos showing 101 action classes... Second, we consider HMDB51 [15], which has 6766 videos that show 51 different actions...
Dataset Splits | No | For both datasets, we use the provided evaluation protocol and report mean average accuracy over three splits into training and test sets. We found that lowering the number of samples used for batch normalization can further improve the generalization performance of the model. For example, for the appearance stream we use a low batch size of 4 for moment estimation during training. This practice strongly supports generalization of the model and nontrivially increases validation accuracy (≈4% on UCF101). (A sketch of this small-batch normalization trick is given after the table.)
Hardware Specification | Yes | For inference, we average the predictions of the fully connected layers (without softmax) over all spatiotemporal locations... which takes 250ms on a Titan X GPU. (A sketch of this fully convolutional averaging is given after the table.)
Software Dependencies | No | Our method has been implemented in MatConvNet [28]. (The paper names the software but does not give a version number for it or for any other software dependency.)
Experiment Setup | Yes | Our method has been implemented in MatConvNet [28] and we share our code and models at https://github.com/feichtenhofer/st-resnet. We train our model in three optimization steps with the parameters listed in Table 2. (A generic placeholder outline of such a staged schedule is sketched below.)
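
The batch-normalization detail quoted in the Dataset Splits row (a batch of only 4 samples for moment estimation on the appearance stream) can be imitated by splitting each optimization batch into small sub-batches for the forward pass, so that batch-norm statistics are computed over few samples while gradients still accumulate over the full batch. Below is a minimal PyTorch sketch of that idea; the tiny model, batch sizes, and optimizer settings are placeholders, not the authors' MatConvNet implementation.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the appearance stream; any network containing
# BatchNorm layers behaves the same way for this trick.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 101),          # 101 classes, as in UCF101
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)  # placeholder hyperparameters
criterion = nn.CrossEntropyLoss()

BN_BATCH = 4  # low batch size used only for batch-norm moment estimation

def train_step(frames, labels):
    """frames: (N, 3, H, W) with N a multiple of BN_BATCH; labels: (N,)."""
    optimizer.zero_grad()
    n_chunks = frames.size(0) // BN_BATCH
    for chunk, target in zip(frames.chunk(n_chunks), labels.chunk(n_chunks)):
        # Each forward pass sees only BN_BATCH samples, so the batch-norm
        # moments are estimated from 4 samples; gradients accumulate over
        # the full batch before the parameter update.
        loss = criterion(model(chunk), target) / n_chunks
        loss.backward()
    optimizer.step()
```

Calling `train_step(torch.randn(16, 3, 224, 224), torch.randint(0, 101, (16,)))` then performs one update whose gradients come from 16 samples while the normalization statistics were always estimated from only 4.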
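
The inference procedure quoted in the Hardware Specification row (averaging fully connected layer predictions, before the softmax, over all spatiotemporal locations) amounts to applying the classifier fully convolutionally and pooling the resulting class scores. A rough PyTorch illustration with an invented backbone and shapes (the real network is the ST-ResNet):

```python
import torch
import torch.nn as nn

NUM_CLASSES = 101

# Invented backbone producing a spatiotemporal feature map of shape
# (batch, channels, T, H, W); it stands in for the ST-ResNet trunk.
backbone = nn.Conv3d(3, 64, kernel_size=3, padding=1)

# The final fully connected layer rewritten as a 1x1x1 convolution so it can
# be evaluated at every spatiotemporal location of the feature map.
fc_as_conv = nn.Conv3d(64, NUM_CLASSES, kernel_size=1)

def predict(clip):
    """clip: (batch, 3, T, H, W) video tensor."""
    with torch.no_grad():
        features = backbone(clip)            # (batch, 64, T, H, W)
        scores = fc_as_conv(features)        # per-location class scores, no softmax yet
        pooled = scores.mean(dim=(2, 3, 4))  # average over all spatiotemporal locations
        return pooled.softmax(dim=1)         # class probabilities, (batch, NUM_CLASSES)
```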
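
The Experiment Setup row mentions three optimization steps whose hyperparameters live in Table 2 of the paper and are not reproduced here. Purely as a structural placeholder, such a staged schedule could be written as a list of per-step settings that a training driver iterates over; every number and step name below is invented and would need to be replaced with the values from Table 2.

```python
# Structural placeholder only: all names and numbers are invented, not the
# values from Table 2 of the paper.
training_steps = [
    {"name": "step 1", "learning_rate": 1e-2, "iterations": 10_000},
    {"name": "step 2", "learning_rate": 1e-3, "iterations": 5_000},
    {"name": "step 3", "learning_rate": 1e-4, "iterations": 2_000},
]

for step in training_steps:
    print(f"{step['name']}: lr={step['learning_rate']}, iters={step['iterations']}")
    # ... the corresponding optimization phase would run here ...
```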