Spatio-Temporal Difference Descriptor for Skeleton-Based Action Recognition

Authors: Chongyang Ding, Kai Liu, Jari Korhonen, Evgeny Belyaev
Pages: 1227-1235

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on several action datasets show that our proposed method achieves up to 3% accuracy improvement over state-of-the-art methods.
Researcher Affiliation | Academia | 1 School of Computer Science and Technology, Xidian University; 2 School of Computer Science and Software Engineering, Shenzhen University; 3 International Laboratory Computer Technologies, ITMO University
Pseudocode | No | The paper describes the proposed methods using mathematical formulas and text, but no explicit pseudocode or algorithm block is provided.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | G3D-Gaming dataset (Bloom, Makris, and Argyriou 2012) contains 20 different gaming action classes performed by 10 different subjects... HDM05 dataset (Müller et al. 2007) contains 130 action classes executed by five different subjects... Northwestern-UCLA dataset (Wang et al. 2014) is captured by three Kinect cameras with different views.
Dataset Splits | Yes | Following the cross-subject test setting, samples performed by half of the subjects are used for training and the remaining half of the samples are used for testing. Following the standard evaluation protocol, we randomly select half of the sequences for training and the rest for testing. We follow the standard evaluation protocol where samples from the first two cameras are used for training and samples captured by the last camera are used for testing.
Hardware Specification | Yes | All experiments were conducted on Matlab deep learning toolbox with a TITAN XP GPU.
Software Dependencies | No | The paper mentions "Matlab deep learning toolbox" and "Adam optimization algorithm" but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | In our experiments, the human skeletal sequence length is set to 100, 50 and 50 for G3D-Gaming, HDM05 and Northwestern-UCLA, respectively. The batch size is set to 30, the learning rate is fixed as 0.01, the directional weights are initialized as random rotation matrices, the row dimensional weights are initialized as random semi-orthogonal matrices and the rectification threshold ϵ is set to 1. The LSTM layer contains 100 neurons and it is trained using Adam optimization algorithm (Kingma and Ba 2014).
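
The reported setup and the three split protocols can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the paper used Matlab and released no source. The record schema (`subject`, `camera`), the function names, the dictionary keys, and the choice of which subject IDs go to training are all assumptions made for the example. The quoted text also does not name the dataset for each protocol; only the camera-based split clearly matches the three-camera Northwestern-UCLA description.

```python
import random

# Hyperparameters as quoted in the Experiment Setup row.
# The key names are illustrative, not taken from the paper.
SETUP = {
    "sequence_length": {"G3D-Gaming": 100, "HDM05": 50, "Northwestern-UCLA": 50},
    "batch_size": 30,
    "learning_rate": 0.01,
    "rectification_threshold": 1,
    "lstm_units": 100,
    "optimizer": "adam",
}


def cross_subject_split(samples, train_subjects):
    """Samples performed by train_subjects go to training, the rest to testing."""
    train = [s for s in samples if s["subject"] in train_subjects]
    test = [s for s in samples if s["subject"] not in train_subjects]
    return train, test


def random_half_split(samples, seed=0):
    """Randomly assign half of the sequences to training and the rest to testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def cross_view_split(samples, train_cameras=(1, 2)):
    """Train on the first two cameras, test on the remaining camera."""
    train = [s for s in samples if s["camera"] in train_cameras]
    test = [s for s in samples if s["camera"] not in train_cameras]
    return train, test


# Toy records standing in for skeleton sequences (hypothetical schema):
# 60 sequences, 10 subjects, 3 cameras.
toy = [{"id": i, "subject": i % 10, "camera": i % 3 + 1} for i in range(60)]

# Assumption: subjects 0-4 train, 5-9 test. The quoted text says only "half of the subjects".
cs_train, cs_test = cross_subject_split(toy, {0, 1, 2, 3, 4})
rh_train, rh_test = random_half_split(toy, seed=0)
cv_train, cv_test = cross_view_split(toy)
```

Fixing the seed in `random_half_split` makes the random protocol repeatable. The quoted text does not report a seed or the exact subject assignment, so a reproduction would need to choose and document both.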