Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

GenHMR: Generative Human Mesh Recovery

Authors: Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Pu Wang, Hongfei Xue, Srijan Das, Chen Chen

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on benchmark datasets demonstrate that GenHMR significantly outperforms state-of-the-art methods. ... We demonstrate through extensive experiments that GenHMR outperforms SOTA methods on standard datasets, including Human3.6M in the controlled environment, and 3DPW and EMDB for in-the-wild scenarios. ... In this ablation study, we investigate how iterative refinement and mask-scheduling strategies influence the model's performance.
Researcher Affiliation | Academia | 1University of North Carolina at Charlotte, Charlotte, NC, USA; 2University of Central Florida, Orlando, FL, USA
Pseudocode | No | The paper describes the methodology in detail using figures and textual descriptions, including mathematical formulations for losses and attention mechanisms. However, it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Project website: https://m-usamasaleem.github.io/publication/GenHMR/GenHMR.html
Open Datasets | Yes | We trained the pose tokenizer using the AMASS (Mahmood et al. 2019) standard training split and MOYO (Tripathi et al. 2023). For GenHMR, following prior work (Goel et al. 2023) and to ensure fair comparisons, we used standard datasets (SD): Human3.6M (H36M) (Ionescu et al. 2013), COCO (Lin et al. 2014), MPI-INF-3DHP (Mehta et al. 2017), and MPII (Andriluka et al. 2014). ... GenHMR is tested on the Human3.6M testing split, following previous works (Kolotouros et al. 2019). To evaluate GenHMR's generalization on challenging in-the-wild datasets with varying camera motions and diverse 3D poses, we test 3DPW (Von Marcard et al. 2018) and EMDB (Kaufmann et al. 2023) without training on them, ensuring a fair assessment on unseen data.
Dataset Splits | Yes | We trained the pose tokenizer using the AMASS (Mahmood et al. 2019) standard training split and MOYO (Tripathi et al. 2023). ... GenHMR is tested on the Human3.6M testing split, following previous works (Kolotouros et al. 2019).
Hardware Specification | Yes | AITI is obtained on a single mid-grade GPU (NVIDIA RTX A5000).
Software Dependencies | No | The paper mentions various models and tools used (e.g., SMPL, VQ-VAE, vision transformer, OpenPose) and their corresponding research papers. However, it does not specify software dependencies with version numbers, such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | During inference, we use an iterative decoding process... The masked tokens are replaced with learnable [MASK] tokens... The number of tokens to be re-masked is determined by a masking schedule γ(t/T)·L, where γ is a decaying function of iteration t... We adopt the cosine masking ratio function γ(τ) = cos(πτ/2)... The final overall loss is L_total = L_mask + L_SMPL + L_3D + L_2D, which combines the pose token prediction loss (L_mask), 3D loss (L_3D), 2D loss (L_2D), and SMPL parameter loss (L_SMPL)... This refinement process continues over P iterations, and our experiments show that only a small number of iterations (5 to 10) is sufficient... Here, η controls the magnitude of the updates to the pose embeddings...
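To make the quoted masking schedule concrete, the following is a minimal sketch of how the re-mask count could be computed per decoding iteration. The excerpt gives only the cosine ratio function γ(τ) = cos(πτ/2) and the product γ(t/T)·L; the ceiling rounding and the loop structure here are assumptions based on common confidence-based mask-decoding schemes, not the authors' implementation.

```python
import math


def cosine_mask_ratio(tau: float) -> float:
    """Cosine masking schedule gamma(tau) = cos(pi * tau / 2) for tau in [0, 1].

    Decays from 1 (everything masked) at tau = 0 to 0 at tau = 1.
    """
    return math.cos(math.pi * tau / 2)


def num_tokens_to_remask(t: int, T: int, L: int) -> int:
    """Number of pose tokens re-masked at iteration t of T, for L tokens total.

    Ceiling rounding is an assumption; the excerpt only states gamma(t/T) * L.
    """
    return math.ceil(cosine_mask_ratio(t / T) * L)


if __name__ == "__main__":
    # With L = 50 pose tokens and T = 10 iterations, the re-mask count
    # shrinks as decoding progresses, so later iterations refine fewer tokens.
    L, T = 50, 10
    print([num_tokens_to_remask(t, T, L) for t in range(T)])
```

Because γ is concave and decaying, early iterations commit only the most confident token predictions while most tokens stay masked; the schedule then releases the remaining tokens in progressively smaller batches.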