Motion-Aware Heatmap Regression for Human Pose Estimation in Videos
Authors: Inpyo Song, Jongmin Lee, Moonwook Ryu, Jangwon Lee
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our motion-aware heatmap regression on PoseTrack (2018, 21) and Sub-JHMDB datasets. Our results validate that the proposed motion-aware heatmaps significantly improve the precision of human pose estimation in videos, particularly in challenging scenarios such as videos like sports game footage with substantial human motions. |
| Researcher Affiliation | Academia | 1Department of Immersive Media Engineering, Sungkyunkwan University, Republic of Korea 2Electronics and Telecommunications Research Institute, Republic of Korea |
| Pseudocode | No | The paper describes its method in detail using text and mathematical formulations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | (Code and related materials are available at https://github.com/Songinpyo/MTPose.) |
| Open Datasets | Yes | PoseTrack: For our evaluation, we utilized two versions of the PoseTrack dataset: PoseTrack2018 [Iqbal et al., 2017] and PoseTrack21 [Doering et al., 2022]. These datasets are key benchmarks in multi-person pose estimation and tracking within video contexts. Sub-JHMDB: The Sub-JHMDB dataset [Jhuang et al., 2013] is a subset of the larger JHMDB collection. |
| Dataset Splits | Yes | We evaluated our MTPose against leading video-based human pose estimation methods on PoseTrack validation sets (AP metric) and Sub-JHMDB dataset (PCK metric). |
| Hardware Specification | Yes | The training setup involves a loss weight γ of 3/4, a batch size of 64, a learning rate of 3e-4 using the AdamW optimizer, and is conducted on a single NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions using a ViT backbone and an off-the-shelf optical flow model, but does not provide specific version numbers for software dependencies like programming languages, libraries (e.g., PyTorch, TensorFlow), or other frameworks. |
| Experiment Setup | Yes | The model operates on images resized to 256 × 192. We used the ViT backbone initialized from the pretrained weights of [Xu et al., 2022b] and fine-tuned it on the PoseTrack2018, PoseTrack21 and Sub-JHMDB datasets. For the frame interval, we set it to 2 for PoseTrack2018 and Sub-JHMDB, while PoseTrack21 uses an interval of 1... We have set the motion threshold δ and the default standard deviation σ0 for Gaussian heatmaps at 3 for all datasets. The training setup involves a loss weight γ of 3/4, a batch size of 64, a learning rate of 3e-4 using the AdamW optimizer... |
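
For readers attempting a reproduction, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below is only an illustration and is not taken from the MTPose repository: the class name, field names, and the `gaussian_heatmap` helper are hypothetical, treating 256 as the height and 192 as the width is an assumption, and the units of δ and σ0 are not stated in the quoted text. The heatmap function is a plain isotropic Gaussian that uses σ0 as its standard deviation; it does not implement the paper's motion-aware adjustment.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the table; all field names are hypothetical."""
    input_height: int = 256          # assumption: 256 is the height
    input_width: int = 192           # assumption: 192 is the width
    loss_weight_gamma: float = 0.75  # loss weight γ = 3/4
    batch_size: int = 64
    learning_rate: float = 3e-4
    optimizer: str = "AdamW"
    motion_threshold_delta: float = 3.0  # δ, units not stated in the source
    sigma0: float = 3.0                  # default Gaussian std σ0
    # Frame interval per dataset, as quoted in the table.
    frame_interval: dict = field(default_factory=lambda: {
        "PoseTrack2018": 2,
        "PoseTrack21": 1,
        "Sub-JHMDB": 2,
    })


def gaussian_heatmap(cx, cy, width, height, sigma):
    """Plain 2D Gaussian centered at (cx, cy), returned as a list of rows.

    This is the standard fixed-sigma target, NOT the motion-aware variant.
    """
    two_sigma_sq = 2.0 * sigma ** 2
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / two_sigma_sq)
         for x in range(width)]
        for y in range(height)
    ]


cfg = TrainConfig()
# Heatmap at input resolution is an illustrative choice, not from the source.
heatmap = gaussian_heatmap(10, 20, cfg.input_width, cfg.input_height, cfg.sigma0)
```

The heatmap is indexed as `heatmap[y][x]`, so the peak for the call above sits at `heatmap[20][10]`. Details the table does not quote, such as the output heatmap resolution, the learning-rate schedule, and the number of epochs, are deliberately left out of the configuration.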