Recovering Complete Actions for Cross-dataset Skeleton Action Recognition

Authors: Hanchao Liu, Yujiang Li, Tai-Jiang Mu, Shi-Min Hu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our approach on a cross-dataset setting with three skeleton action datasets, outperforming other domain generalization approaches by a considerable margin. We improve the average accuracy on unseen datasets by 5%, outperforming other baseline methods by a large margin.
Researcher Affiliation | Academia | Hanchao Liu¹, Yujiang Li¹, Tai-Jiang Mu¹, Shi-Min Hu¹. ¹BNRist, Department of Computer Science and Technology, Tsinghua University
Pseudocode | Yes | A1. Algorithm. The algorithmic description of our recover-and-resample augmentation framework is provided in Algorithm 1.
Open Source Code | Yes | Code is available at https://github.com/HanchaoLiu/Recover-and-Resample
Open Datasets | Yes | We use four large-scale datasets, i.e., NTU60-RGBD [41], PKU-MMD [25], ETRI-Activity3D [15] and Kinetics [5]. For terms of use, NTU60-RGBD [41] is free for research and non-commercial use. We submitted a license agreement to the ETRI-Activity3D [15] website to download the dataset. The license for PKU-MMD [25] is not stated on its official homepage.
Dataset Splits | No | For skeleton action recognition, most works [42, 6] report best results on the test set since there is no official validation set. We also follow this evaluation protocol for our domain generalization task.
Hardware Specification | Yes | We implement our method and other baselines using PyTorch [36] and conduct experiments on a single NVIDIA RTX 2080Ti.
Software Dependencies | No | The paper mentions 'PyTorch [36]' but does not specify a version number.
Experiment Setup | Yes | We set training hyper-parameters the same as [42]. Furthermore, we set λ_T = 0.1, the number of linear transforms N_tr = 20, the number of background poses N_bkg = 10, and m_aug = 0.75. We set α = 0.1 for sampling t_p. We fix the resampling method by randomly sampling a segment with length ratio r between 0.7 and 1.0. We use k-means for clustering boundary poses and linear transforms.
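
The resampling step quoted in the Experiment Setup row (a random segment whose length ratio r lies between 0.7 and 1.0) can be illustrated with a minimal sketch. This is not the authors' implementation. The dictionary key names, the function name `sample_segment`, the uniform draw of r, and the time-first array layout are all assumptions made for illustration. Only the numeric values come from the quoted setup.

```python
import numpy as np

# Values quoted in the Experiment Setup row; the key names are hypothetical.
SETUP = {
    "lambda_T": 0.1,            # λ_T
    "n_transforms": 20,         # N_tr, number of linear transforms
    "n_background_poses": 10,   # N_bkg, number of background poses
    "m_aug": 0.75,              # m_aug
    "alpha": 0.1,               # α, used when sampling t_p
    "ratio_range": (0.7, 1.0),  # range of the segment length ratio r
}


def sample_segment(seq, ratio_range=SETUP["ratio_range"], rng=None):
    """Crop a random contiguous segment covering a fraction r of the frames.

    seq is assumed to be a non-empty array with time on the first axis,
    e.g. shape (frames, joints, coords). r is drawn uniformly from
    ratio_range, which is an assumption about how r is sampled.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames = seq.shape[0]
    r = rng.uniform(ratio_range[0], ratio_range[1])
    seg_len = max(1, int(round(r * num_frames)))
    # The upper bound is exclusive, so every valid start index is reachable.
    start = int(rng.integers(0, num_frames - seg_len + 1))
    return seq[start:start + seg_len]
```

For a 100-frame sequence, `sample_segment(np.zeros((100, 25, 3)))` returns between 70 and 100 consecutive frames and leaves the joint and coordinate axes unchanged. Whether the cropped segment is then interpolated back to a fixed length is not stated in the quoted setup, so the sketch stops at the crop.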