Learning Comprehensive Motion Representation for Action Recognition
Authors: Mingyu Wu, Boyuan Jiang, Donghao Luo, Junchi Yan, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xiaokang Yang (pp. 2934-2942)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our method on three large-scale benchmark datasets, i.e., Something-Something V1 & V2 (Goyal et al. 2017) and Kinetics-400 (Kay et al. 2017). Furthermore, the hyperparameter in our method is discussed. We also conduct an ablation study on the temporal reasoning dataset Something-Something V1 to analyze CME and SME's performance individually and visualize each part's effect. Finally, we give a runtime analysis to show the efficiency of our method compared with state-of-the-art methods. |
| Researcher Affiliation | Collaboration | 1 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University 2 Department of Computer Science and Engineering, Shanghai Jiao Tong University 3 Youtu Lab, Tencent |
| Pseudocode | No | The paper describes the proposed modules and framework using text and mathematical equations, but it does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We test our method on three large-scale benchmark datasets, i.e., Something-Something V1 & V2 (Goyal et al. 2017) and Kinetics-400 (Kay et al. 2017). |
| Dataset Splits | Yes | The subscripts of Val and Test indicate dataset version and top-1 accuracy is reported. |
| Hardware Specification | Yes | We follow the inference settings in (Lin, Gan, and Han 2019) by using a single NVIDIA Tesla V100 GPU to measure the latency and throughput. |
| Software Dependencies | No | The paper mentions using ResNet-50 and ImageNet pre-training but does not specify software versions for libraries like PyTorch, TensorFlow, or Python. |
| Experiment Setup | Yes | For the Something-Something dataset, we train the model for 50 epochs, set the initial learning rate to 0.01 and reduce it by a factor of 10 at 30, 40, 45 epochs. For Kinetics-400, our model is trained for 100 epochs. The initial learning rate is set to 0.01 and will be reduced by a factor of 10 at 50, 75, and 90 epochs. Stochastic Gradient Descent (SGD) with momentum 0.9 is utilized as the optimizer, and the batch size is 64 for all three datasets. |
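The step-decay schedule quoted above (initial learning rate 0.01, divided by 10 at the listed milestone epochs) can be sketched as a small helper. This is a minimal illustration of the schedule as described, not the authors' code; the function name and defaults are ours.

```python
def learning_rate(epoch, base_lr=0.01, milestones=(30, 40, 45), gamma=0.1):
    """Step-decay schedule described in the paper: start at base_lr and
    multiply by gamma once for each milestone epoch already passed."""
    decays = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** decays

# Something-Something V1/V2: 50 epochs, drops at epochs 30, 40, 45.
print(learning_rate(0))    # 0.01
print(learning_rate(35))   # ~0.001
print(learning_rate(48))   # ~1e-05

# Kinetics-400: 100 epochs, drops at epochs 50, 75, 90.
print(learning_rate(80, milestones=(50, 75, 90)))  # ~1e-04
```

In a PyTorch setup this would correspond to `torch.optim.SGD(..., lr=0.01, momentum=0.9)` paired with `MultiStepLR(optimizer, milestones=[30, 40, 45], gamma=0.1)`.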