Topology-Aware Convolutional Neural Network for Efficient Skeleton-Based Action Recognition
Authors: Kailin Xu, Fanfan Ye, Qiaoyong Zhong, Di Xie
AAAI 2022, pp. 2866-2874
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on four widely used datasets, i.e. N-UCLA, SBU, NTU RGB+D and NTU RGB+D 120 to verify the effectiveness of Ta-CNN. |
| Researcher Affiliation | Collaboration | Kailin Xu¹*, Fanfan Ye², Qiaoyong Zhong², Di Xie²; ¹Zhejiang University, ²Hikvision Research Institute; kailinxu@zju.edu.cn, {yefanfan, zhongqiaoyong, xiedi}@hikvision.com |
| Pseudocode | No | The paper provides mathematical formulations and architectural diagrams but no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | Yes | Our method is evaluated on four widely used datasets, i.e. NTU RGB+D, NTU RGB+D 120, Northwestern-UCLA and SBU Kinect Interaction. [...] NTU RGB+D contains 56,880 samples of 60 classes performed by 40 distinct subjects. [...] NTU RGB+D 120 is an extension of NTU RGB+D containing 114,480 samples. [...] Northwestern-UCLA (N-UCLA) is captured by using three Kinect cameras. It contains 1494 samples of 10 classes. [...] SBU Kinect Interaction (SBU) depicts human interaction captured by Kinect. It contains 282 sequences covering 8 action classes. |
| Dataset Splits | Yes | NTU RGB+D contains 56,880 samples of 60 classes performed by 40 distinct subjects. There are two recommended benchmarks, i.e. cross-subject (X-sub) and cross-view (X-view). [...] NTU RGB+D 120 [...] There are two benchmarks, i.e. cross-subject (X-sub) and cross-setup (X-setup). [...] Subject-independent 5-fold cross-validation is performed following the same evaluation protocol as previous works (Yun et al. 2012). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments. |
| Software Dependencies | No | The proposed model is implemented in PyTorch (Paszke et al. 2017). No specific version numbers for software dependencies are provided. |
| Experiment Setup | Yes | The model is trained for 800 epochs with the Adam optimizer. The learning rate is set to 0.001 initially and decayed by a factor of 0.1 at 650, 730 and 770 epochs for all datasets. The weight decay is set to 0.0002 for SBU and 0.0001 for the rest datasets. The batch sizes for NTU RGB+D, NTU RGB+D 120, N-UCLA and SBU are 64, 64, 16 and 8 respectively. As for data pre-processing method, we follow previous works (Shi et al. 2019; Cheng et al. 2020b; Li et al. 2018). For SBU, a warmup strategy is used during the first 30 epochs. The λ in Eq. (8) is set to 0.6, and the mixing ratio α for Skeleton Mix is set to 1/16. |
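Since the authors do not release code, the Experiment Setup row above is the only ground truth for the training recipe. The following is a minimal PyTorch sketch of the reported optimization schedule; the placeholder model, the per-dataset dictionary names, and the linear shape of the SBU warmup ramp are assumptions (the paper states that a warmup is used for the first 30 epochs but not its form).

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR

# Placeholder module; Ta-CNN itself is not publicly released.
model = torch.nn.Linear(75, 60)

# Reported settings: Adam, initial lr 0.001, decayed by 0.1 at epochs
# 650, 730 and 770; weight decay 0.0001 (NTU RGB+D, NTU RGB+D 120,
# N-UCLA) or 0.0002 (SBU).
optimizer = Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = MultiStepLR(optimizer, milestones=[650, 730, 770], gamma=0.1)

EPOCHS = 800
BATCH_SIZE = {"ntu_rgbd": 64, "ntu_rgbd_120": 64, "n_ucla": 16, "sbu": 8}

def sbu_warmup_scale(epoch, warmup_epochs=30):
    """SBU uses a warmup over the first 30 epochs; the ramp shape is
    not specified in the paper, so a linear ramp is assumed here."""
    return min(1.0, (epoch + 1) / warmup_epochs)

for epoch in range(EPOCHS):
    # ... forward/backward passes over the training loader would go here ...
    optimizer.step()   # placeholder step so the schedule advances cleanly
    scheduler.step()   # epoch-wise decay at the reported milestones
```

Note that the loss weight λ = 0.6 (Eq. 8 of the paper) and the SkeletonMix ratio α = 1/16 are model-specific and are not reproduced here, as their exact usage depends on the unreleased Ta-CNN code.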
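The Dataset Splits row above also cites subject-independent 5-fold cross-validation on SBU (Yun et al. 2012). A minimal sketch of enforcing subject independence is shown below; the participant-pair ids are hypothetical stand-ins (the official protocol fixes the five folds by actor pair rather than deriving them), and scikit-learn's GroupKFold is used only to approximate that grouping idea.

```python
from sklearn.model_selection import GroupKFold

# Dummy stand-in for the 282 SBU sequences; pair ids 0..20 mimic SBU's
# two-actor sets. In the real protocol, ids come from the recording metadata.
sequences = list(range(282))
pair_ids = [i % 21 for i in sequences]  # assumption: roughly balanced pairs

# Grouping by participant pair keeps the same actors out of both the train
# and test folds, i.e. the "subject-independent" property of the protocol.
gkf = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(gkf.split(sequences, groups=pair_ids)):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test sequences")
```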