Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition
Authors: Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Jian Yang
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale challenging NTU RGB+D. The experimental results demonstrate the effectiveness of our proposed model and the improvement over the state-of-the-art. |
| Researcher Affiliation | Academia | Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Jian Yang. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; School of Biological Science & Medical Engineering, Southeast University, Nanjing, China. {zhen.cui, cyx, csjyang}@njust.edu.cn; {lichaolong, wenming_zheng}@seu.edu.cn |
| Pseudocode | No | The paper provides mathematical formulations of the model but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code, such as a repository link or an explicit statement about code release. |
| Open Datasets | Yes | We conduct experiments on four public skeleton based action datasets: Florence 3D (Seidenari et al. 2013), HDM05 (Müller et al. 2007), Large Scale Combined dataset (Zhang et al. 2016) and NTU RGB+D (Shahroudy et al. 2016). |
| Dataset Splits | Yes | We follow the standard experimental settings to perform leave-one-subject-out cross-validation (Wang et al. 2016a). ... we conduct two types of experiments by following two widely-used protocols. First, we use two subjects, bd and mm, for training and the remaining three for testing (Wang et al. 2015a). Second, to compare fairly with current deep learning methods, we conduct 10 random evaluations, each of which randomly selects half of the sequences for training and the rest for testing (Huang and Van Gool 2017). ... we conduct experiments using two standard settings, i.e., random cross-subject evaluation and random cross-sample evaluation. For each action, half of the subjects/samples are randomly selected for training and the rest are used for testing. ... We follow the two standard evaluation protocols (Shahroudy et al. 2016), cross-view evaluation and cross-subject evaluation, to perform experiments. (A minimal split sketch is given after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory amounts) used for the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, etc.). |
| Experiment Setup | Yes | In the default case, the dimension of the output signals is d = 32. For deep STGC, we empirically observe that stacking two layers is good enough for these datasets, so we employ only a two-layer network; the outputs of the two layers are 32 and 64 dimensions, respectively. The most important factor of our model is the scale of the convolutional kernels, which is analyzed in Section 4.3. ... we set K1 = 2 and K2 = 6 as defaults for the multi-scale STGC_K (indep.) and STGC_K (dep.) variants ... we normalize/clip L ← L/λ_max (λ_max = 2), W_{k,ii} ≥ 0, and Σ_k ‖W_k‖_1 < 1 after the gradient update at each iteration to ensure model stability. The outputs of the recursive model are concatenated and fed into a softmax layer for classification. (A constraint-projection sketch is given after the table.) |
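
The random half-split protocols quoted in the Dataset Splits row are straightforward to reproduce. Below is a minimal sketch of the random cross-subject split, assuming per-sample subject labels in a `subject_ids` array (the names are illustrative, not from the authors' code); the cross-sample variant would shuffle sample indices directly instead.

```python
import numpy as np

def cross_subject_split(subject_ids, seed=0):
    """Randomly assign half of the subjects to training, the rest to testing."""
    rng = np.random.default_rng(seed)
    subjects = np.unique(subject_ids)
    rng.shuffle(subjects)
    train_subjects = set(subjects[: len(subjects) // 2])
    # Map the subject-level split back to per-sample indices.
    train_mask = np.array([s in train_subjects for s in subject_ids])
    return np.where(train_mask)[0], np.where(~train_mask)[0]

# For the "10 random evaluations" protocol, repeat with seeds 0..9
# and average the test accuracy across runs:
# train_idx, test_idx = cross_subject_split(subject_ids, seed=run)
```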
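
The stability constraints quoted in the Experiment Setup row amount to a projection step applied after every gradient update. The sketch below assumes the learnable kernel weights W_k are kept as a list of square PyTorch matrices; it illustrates the stated constraints and is not the authors' code (none is released).

```python
import torch

def project_constraints(kernels, eps=1e-6):
    """Project kernel weights back onto the constraint set after a step.

    Enforces W_{k,ii} >= 0 and sum_k ||W_k||_1 < 1, as quoted above.
    `kernels` is an assumed list of learnable (n x n) weight matrices.
    """
    with torch.no_grad():
        for W in kernels:
            W.diagonal().clamp_(min=0.0)           # clip: W_{k,ii} >= 0
        total = sum(W.abs().sum() for W in kernels)
        if total >= 1.0:                           # rescale: sum_k ||W_k||_1 < 1
            for W in kernels:
                W.mul_((1.0 - eps) / total)

# Called once per iteration after the optimizer step; the Laplacian is
# normalized separately as L <- L / lambda_max with lambda_max = 2.
```

Keeping the summed l1 norms below 1 keeps the recursive propagation contractive, which matches the paper's stated goal of model stability.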