Direct Multi-view Multi-person 3D Pose Estimation

Authors: Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show experimentally that our MvP model outperforms the state-of-the-art methods on several benchmarks while being much more efficient. Notably, it achieves 92.3% AP25 on the challenging Panoptic dataset, improving upon the previous best approach [40] by 9.8%. Comprehensive experiments on 3D pose benchmarks Panoptic [19], as well as Shelf and Campus [1], demonstrate that our MvP works very well.
Researcher Affiliation | Collaboration | Tao Wang (1,2), Jianfeng Zhang (2), Yujun Cai (1), Shuicheng Yan (1), Jiashi Feng (1); 1: Sea AI Lab, 2: National University of Singapore. Contact: twangnh@gmail.com, zhangjianfeng@u.nus.edu, {caiyj,yansc,fengjs}@sea.com
Pseudocode | No | The paper describes the model architecture and training process in text and diagrams, but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and models are available at https://github.com/sail-sg/mvp.
Open Datasets | Yes | Panoptic [20] is a large-scale benchmark with 3D skeleton joint annotations. Shelf and Campus [1] are two multi-person datasets capturing indoor and outdoor environments, respectively.
Dataset Splits | No | The paper describes training/testing splits ('Following VoxelPose [40], we use the same data sequences except 160906_band3 in the training set due to broken images.' and 'We split them into training and testing sets following [1, 6, 40].'), but does not define a separate validation split with specific percentages or counts.
Hardware Specification | Yes | GPU: GeForce RTX 2080 Ti; CPU: Intel i7-6900K @ 3.20GHz. For all methods, runtime is measured on this GPU/CPU configuration.
Software Dependencies | No | The paper mentions using the Adam optimizer and building upon ResNet-50 for feature extraction, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | The model is trained for 40 epochs with the Adam optimizer at a learning rate of 1e-4. During inference, a confidence threshold of 0.1 is used to filter out redundant predictions. Please refer to the supplementary material for more implementation details. Unless otherwise stated, a stack of six transformer decoder layers is used.
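The reported setup (40 epochs, Adam at learning rate 1e-4, six decoder layers, 0.1 confidence threshold) can be sketched as a small configuration plus inference-time filter. This is a hypothetical illustration, not the authors' code: the config keys and the `filter_predictions` helper are assumptions made for clarity.

```python
# Hypothetical configuration mirroring the settings reported in the paper.
TRAIN_CONFIG = {
    "epochs": 40,
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "num_decoder_layers": 6,
    "confidence_threshold": 0.1,
}


def filter_predictions(poses, scores, threshold=TRAIN_CONFIG["confidence_threshold"]):
    """Drop redundant low-confidence predictions at inference time.

    `poses` is a list of predicted 3D poses and `scores` the matching
    per-prediction confidences; only pairs above `threshold` are kept.
    """
    return [(pose, score) for pose, score in zip(poses, scores) if score > threshold]
```

For example, with two candidate poses scored 0.9 and 0.05, only the first survives the 0.1 threshold.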