Sequential 3D Human Pose Estimation Using Adaptive Point Cloud Sampling Strategy
Authors: Zihao Zhang, Lei Hu, Xiaoming Deng, Shihong Xia
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the ITOP dataset and the NTU-RGBD dataset demonstrate that all of our contributed components are effective, and our method can achieve state-of-the-art performance. |
| Researcher Affiliation | Academia | (1) Institute of Computing Technology, Chinese Academy of Sciences; (2) University of Chinese Academy of Sciences; (3) Institute of Software, Chinese Academy of Sciences |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The supplementary material links to the code at https://github.com/Hmslab/Adapose. |
| Open Datasets | Yes | In our experiment, we use the ITOP dataset [Haque et al., 2016] and NTU-RGBD dataset [Shahroudy et al., 2016; Liu et al., 2019a] to evaluate our method. |
| Dataset Splits | No | The paper mentions training on fully labeled and weakly labeled data and evaluating on the 'ITOP test dataset', but it does not explicitly describe a separate validation split. |
| Hardware Specification | Yes | The running times of our method, V2V, and WSM are 50.0, 3.5, and 24.4 FPS on a single NVIDIA 2080Ti GPU. |
| Software Dependencies | No | The paper mentions components like the Adam optimizer, PointNet, and LSTM but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | During training, we use the Adam optimizer with a learning rate of 0.0005, which is set to decay by 0.05% every 1000 iterations. The bounding box size L is [1.8, 2, 1.5]. In our experiments, we set the weights λ_3D, λ_2D, λ_consis, and λ_sam to 10, 0.1, 1e-3, and 1. In the point cloud sampling module, we choose ϵ = 0.025 and M = 4 in the sampling center generation step and 8-nearest neighbors in the projection step. In the density-based sampling module, the original point clouds are fed into five 1D convolution layers, each followed by a ReLU activation layer. The output dimensions of the five convolution layers are 64, 128, 256, 512, and 128, respectively. A fully connected layer with 512 neurons then generates the sampling centers and the weights of the original point clouds (see the sketches after this table). |
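For concreteness, here is a minimal sketch of the density-based sampling module's feature extractor as the setup describes it: five 1D convolutions with output dimensions 64, 128, 256, 512, and 128, each followed by a ReLU, then a 512-neuron fully connected layer producing the M = 4 sampling centers and per-point weights. PyTorch, the kernel size of 1, the max-pooling step, and the two output heads are assumptions; the paper text specifies only the layer widths and M.

```python
import torch
import torch.nn as nn

class DensityBasedSampler(nn.Module):
    """Sketch of the density-based sampling module's feature extractor."""

    def __init__(self, in_channels: int = 3, num_centers: int = 4):
        super().__init__()
        dims = [64, 128, 256, 512, 128]   # conv widths stated in the paper
        layers, prev = [], in_channels
        for d in dims:
            # 1D convolution followed by a ReLU activation, per the setup.
            layers += [nn.Conv1d(prev, d, kernel_size=1), nn.ReLU()]
            prev = d
        self.backbone = nn.Sequential(*layers)
        self.fc = nn.Linear(prev, 512)    # 512-neuron fully connected layer
        # Hypothetical heads: M sampling centers in 3D, one weight per point.
        self.center_head = nn.Linear(512, num_centers * 3)
        self.point_head = nn.Linear(prev, 1)
        self.num_centers = num_centers

    def forward(self, points: torch.Tensor):
        # points: (B, 3, N) raw point cloud
        feat = self.backbone(points)               # (B, 128, N) per-point features
        pooled = feat.max(dim=2).values            # (B, 128) global feature
        h = torch.relu(self.fc(pooled))            # (B, 512)
        centers = self.center_head(h).view(-1, self.num_centers, 3)   # (B, M, 3)
        weights = torch.sigmoid(self.point_head(feat.transpose(1, 2))).squeeze(-1)  # (B, N)
        return centers, weights
```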
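And a sketch of the reported training schedule, reusing the module above: Adam with a learning rate of 0.0005, multiplied by 0.9995 every 1000 iterations via a step scheduler. Reading "decay 0.05%" as that multiplicative factor is an assumption, and the objective below is a placeholder; the actual losses weight the 3D, 2D, consistency, and sampling terms by the λ values quoted in the table.

```python
import torch

model = DensityBasedSampler()   # module sketched above
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
# Stepped once per training iteration, so the lr shrinks by 0.05%
# (factor 0.9995) after every 1000 iterations.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.9995)

# Loss weights reported in the paper (λ_3D, λ_2D, λ_consis, λ_sam).
lambda_3d, lambda_2d, lambda_consis, lambda_sam = 10.0, 0.1, 1e-3, 1.0

# One dummy step on a synthetic (B=2, 3, N=1024) point cloud to show the
# wiring; the real 3D/2D/consistency losses are not reproduced here.
points = torch.randn(2, 3, 1024)
centers, weights = model(points)
loss = lambda_sam * weights.mean()   # placeholder objective
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step()
```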