XFormer: Fast and Accurate Monocular 3D Body Capture

Authors: Lihui Qian, Xintong Han, Faqiang Wang, Hongyu Liu, Haoye Dong, Zhiwen Li, Huawei Wei, Zhe Lin, Cheng-Bin Jin

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments in Section 5.2 demonstrate that this framework significantly outperforms each individual branch that only captures a single modality.
Researcher Affiliation | Collaboration | (1) Huya Inc; (2) Hong Kong University of Science and Technology; (3) Carnegie Mellon University; (4) Tencent
Pseudocode | No | The paper describes the proposed system and its components in detail using text and figures, but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository.
Open Datasets | Yes | As discussed in Section 2 and summarized in Table 2, common datasets can be divided into the following categories: 1) Image datasets with 3D annotations, such as 3DPW, UP3D [Lassner et al., 2017], MuCo-3DHP [Mehta et al., 2018]; 2) Image datasets with 2D keypoints annotations, such as COCO [Lin et al., 2014], MPII [Andriluka et al., 2014]; 3) Image datasets with 2D keypoints annotations and pseudo 3D human labels, such as SPIN fits on COCO, Pose2Mesh fits on Human3.6M; 4) MoCap datasets without images, such as AMASS [Mahmood et al., 2019].
Dataset Splits | No | The paper states, 'We evaluate on the Human3.6M and 3DPW datasets following the protocols in [Kanazawa et al., 2018; Kolotouros et al., 2019]' but does not explicitly provide the specific train/validation/test splits used for reproduction (e.g., exact percentages or sample counts).
Hardware Specification | Yes | Our system with a light backbone takes less than 7ms per frame for an input person on an Nvidia GTX 1660 GPU and 30ms with a single thread of Intel i7-8700 CPU, obtaining significant speedup while maintaining satisfactory accuracy.
Software Dependencies | No | The paper mentions 'All CPU models are accelerated with OpenVINO', but does not specify version numbers for OpenVINO or any other software dependencies such as deep learning frameworks or programming languages.
Experiment Setup | No | While the paper describes data augmentation strategies (random rotations, shifting, and scaling), it does not explicitly provide concrete hyperparameter values such as learning rate, batch size, or optimizer settings in the main text.
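
The per-frame latencies quoted in the Hardware Specification row can be read as approximate throughput figures. The sketch below is only a unit conversion, not something taken from the paper; the helper name frames_per_second is hypothetical, and the conversion assumes frames are processed one after another with no batching or pipelining.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Latencies quoted in the Hardware Specification row.
# "Less than 7 ms" is an upper bound on latency, so the GPU figure is a lower bound on FPS.
gpu_fps = frames_per_second(7.0)   # Nvidia GTX 1660
cpu_fps = frames_per_second(30.0)  # single thread of an Intel i7-8700

print(f"GPU: more than {gpu_fps:.1f} FPS")
print(f"CPU: about {cpu_fps:.1f} FPS")
```

Under these assumptions the GPU figure corresponds to more than roughly 142 frames per second and the CPU figure to roughly 33 frames per second for a single person.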