Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition
Authors: Haoze Wu, Jiawei Liu, Zheng-Jun Zha, Zhenzhong Chen, Xiaoyan Sun
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show MRST-Net yields the best performance, compared to state-of-the-art approaches. |
| Researcher Affiliation | Collaboration | National Engineering Laboratory for Brain-inspired Intelligence Technology and Application, University of Science and Technology of China; School of Remote Sensing and Information Engineering, Wuhan University; Intelligent Multimedia Group, Microsoft Research Asia |
| Pseudocode | No | No structured pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Three well-known benchmarks, i.e., Kinetics400 [Kay et al., 2017], UCF-101 [Soomro et al., 2012], and HMDB-51 [Kuehne et al., 2013], are included in the evaluations. |
| Dataset Splits | No | The paper states 'Both UCF101 and HMDB51 are provided with 3 splits for training and testing' but does not explicitly detail a validation split or its size/percentage for any dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies, such as programming languages, libraries, or frameworks used. |
| Experiment Setup | Yes | Our data augmentation includes random clipping on both spatial (firstly resizing the smaller video side to 256 pixels, then randomly cropping a 224×224 patch) and temporal (randomly picking the starting frame among those early enough to guarantee a desired number of frames). Batch normalization is applied to all convolutional layers. We use the Adam Gradient Descent optimizer with an initial learning rate of 1e-4 to train the MRST-related networks from scratch. The dropout ratio and weight decay rate are set to 0.5 and 5e-5. |
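
Since the paper releases no code, the following PyTorch sketch is one plausible reading of the quoted experiment setup. Only the augmentation scheme, optimizer choice, learning rate, dropout ratio, and weight decay come from the paper; the `MRSTNet` class is a hypothetical placeholder (the real MRST-Net architecture is not reproduced here), and the clip length of 16 frames is an assumption, since the paper excerpt does not state it.

```python
import random

import torch
import torch.nn as nn
from torchvision import transforms

NUM_FRAMES = 16  # ASSUMPTION: clip length is not given in the quoted setup

# Spatial augmentation per frame: resize the shorter side to 256 pixels,
# then randomly crop a 224x224 patch (applied to PIL frames).
spatial_aug = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
])


def temporal_random_clip(frames, num_frames=NUM_FRAMES):
    """Randomly pick a start frame early enough to yield `num_frames` frames."""
    start = random.randint(0, max(0, len(frames) - num_frames))
    return frames[start:start + num_frames]


class MRSTNet(nn.Module):
    """Placeholder stand-in for MRST-Net; the real architecture is omitted.

    It only mirrors two quoted details: batch normalization after the
    convolutional layer and a dropout ratio of 0.5 before the classifier.
    """

    def __init__(self, num_classes=400):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 64, kernel_size=3, padding=1),
            nn.BatchNorm3d(64),  # BN applied to all conv layers, per the paper
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),  # dropout ratio 0.5, per the paper
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)


model = MRSTNet(num_classes=400)  # 400 classes matches Kinetics400
criterion = nn.CrossEntropyLoss()

# Adam from scratch with initial lr 1e-4 and weight decay 5e-5, as quoted.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-5)

# One illustrative training step on a dummy batch of shape (N, C, T, H, W).
clip = torch.randn(2, 3, NUM_FRAMES, 224, 224)
labels = torch.randint(0, 400, (2,))
loss = criterion(model(clip), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that passing `weight_decay` to Adam applies L2 regularization inside the Adam update; whether the authors did this or decayed weights separately is not specified in the paper.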