Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Action Recognition with Multi-stream Motion Modeling and Mutual Information Maximization
Authors: Yuheng Yang, Haipeng Chen, Zhenguang Liu, Yingda Lyu, Beibei Zhang, Shuang Wu, Zhibo Wang, Kui Ren
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct extensive experiments to empirically evaluate our method on three benchmark action recognition datasets. Empirically, our approach sets the new state-of-the-art performance on three benchmark datasets, NTU RGB+D, NTU RGB+D 120, and NW-UCLA. |
| Researcher Affiliation | Collaboration | Yuheng Yang¹, Haipeng Chen¹, Zhenguang Liu², Yingda Lyu³, Beibei Zhang⁵, Shuang Wu⁴, Zhibo Wang² and Kui Ren²; ¹College of Computer Science and Technology, Jilin University; ²School of Cyber Science and Technology, Zhejiang University; ³Public Computer Education and Research Center, Jilin University; ⁴Black Sesame Technologies; ⁵Zhejiang Lab |
| Pseudocode | No | The paper does not contain a section or figure explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | The implementations are released, hoping to facilitate future research. Our code is released to facilitate researchers. https://github.com/ActionR-Group/Stream-GCN |
| Open Datasets | Yes | We adopt three widely used action recognition datasets, namely NTU-RGB+D [Shahroudy et al., 2016], NTU-RGB+D 120 [Liu et al., 2019a], and Northwestern-UCLA [Wang et al., 2014], to evaluate the proposed method. |
| Dataset Splits | Yes | NTU-RGB+D. This dataset provides two sub-benchmarks: (1) Cross-Subject (X-Sub): data for 20 subjects is used as the training data, while the rest is used as test data. (2) Cross-View (X-View) divides the training and test sets according to different camera views. NTU-RGB+D 120. Within the dataset, two benchmarks are maintained: (1) Cross-Subject (X-Sub), which categorizes 53 subjects into the training class and the other 53 subjects into the test class. (2) Cross-Setup (X-Set), which arranges data items with even IDs into the training group and odd IDs into the test group. Northwestern-UCLA. We follow the evaluation protocol mentioned in [Wang et al., 2014], where videos collected by the first two cameras serve as the training samples and the rest serve as test samples. |
| Hardware Specification | Yes | We conduct experiments on a computer equipped with an Intel Xeon E5 CPU at 2.1GHz, three NVIDIA GeForce GTX 1080 Ti GPUs, and the RAM of 64GB. |
| Software Dependencies | Yes | We leverage PyTorch 1.1 to implement our model. |
| Experiment Setup | Yes | We apply stochastic gradient descent (SGD) with 0.9 Nesterov momentum to train the Stream-GCN model. For NTU-RGB+D and NTU-RGB+D 120 datasets, the number of training epochs is set to 65 with the first 5 epochs being warm-up epochs, which help stabilize the training process. For NTU-RGB+D and NTU-RGB+D 120 datasets, the initial learning rate is set to 0.1 and decays by 0.1 every 35 epochs, the batch size is selected as 64. For the Northwestern-UCLA dataset, the initial learning rate is set to 0.01 and decays by 0.0001 every 50 epochs, the batch size is set to 16. |
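
The NTU-RGB+D learning-rate schedule quoted in the Experiment Setup row can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The function name `learning_rate`, the linear shape of the warm-up, reading "decays by 0.1" as multiplying the rate by 0.1, and counting decay steps from epoch 0 are all assumptions; the quoted text does not specify them. The Northwestern-UCLA schedule is left out because its quoted decay value is ambiguous.

```python
def learning_rate(epoch, base_lr=0.1, warmup_epochs=5,
                  decay_every=35, decay_factor=0.1):
    """Return the learning rate for a 0-indexed epoch.

    Assumptions (not stated in the quoted text):
    - warm-up ramps linearly up to base_lr over the first warmup_epochs;
    - "decays by 0.1" means the rate is multiplied by decay_factor;
    - decay steps are counted from epoch 0, not from the end of warm-up.
    """
    if epoch < warmup_epochs:
        # Linear ramp: reaches base_lr on the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Step decay: one multiplication per completed block of decay_every epochs.
    return base_lr * decay_factor ** (epoch // decay_every)


# 65 training epochs, as quoted for NTU-RGB+D and NTU-RGB+D 120.
schedule = [learning_rate(epoch) for epoch in range(65)]

if __name__ == "__main__":
    for epoch in (0, 4, 34, 35, 64):
        print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.4f}")
```

Under these assumptions the rate ramps up over epochs 0 to 4, holds at 0.1 through epoch 34, and drops once at epoch 35, so a 65-epoch run sees a single decay step.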