Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Direct Multi-view Multi-person 3D Pose Estimation
Authors: Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show experimentally that our MvP model outperforms the state-of-the-art methods on several benchmarks while being much more efficient. Notably, it achieves 92.3% AP25 on the challenging Panoptic dataset, improving upon the previous best approach [40] by 9.8%. Comprehensive experiments on 3D pose benchmarks Panoptic [19], as well as Shelf and Campus [1] demonstrate our MvP works very well. |
| Researcher Affiliation | Collaboration | Tao Wang1,2, Jianfeng Zhang2, Yujun Cai1, Shuicheng Yan1, Jiashi Feng1; 1Sea AI Lab, 2National University of Singapore; EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the model architecture and training process in text and diagrams, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and models are available at https://github.com/sail-sg/mvp. |
| Open Datasets | Yes | Datasets: Panoptic [20] is a large-scale benchmark with 3D skeleton joint annotations. Shelf and Campus [1] are two multi-person datasets capturing indoor and outdoor environments, respectively. |
| Dataset Splits | No | The paper mentions splitting data into training and testing sets ('Following VoxelPose [40], we use the same data sequences except 160906_band3 in the training set due to broken images.' and 'We split them into training and testing sets following [1, 6, 40].') but does not explicitly define a separate validation dataset split with specific percentages or counts. |
| Hardware Specification | Yes | GPU: GeForce RTX 2080 Ti CPU: i7-6900K @ 3.20GHz. For all methods, the time is counted on GPU GeForce RTX 2080 Ti and CPU Intel i7-6900K @ 3.20GHz. |
| Software Dependencies | No | The paper mentions using Adam optimizer and building upon ResNet-50 for feature extraction, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The model is trained for 40 epochs, with the Adam optimizer of learning rate 10^-4. During inference, a confidence threshold of 0.1 is used to filter out redundant predictions. Please refer to supplementary for more implementation details. ... Unless otherwise stated, we use a stack of six transformer decoder layers. |
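
The inference-time filtering step quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' implementation: the `PosePrediction` container and `filter_predictions` helper are hypothetical names introduced only for illustration, the learning rate is taken to be 10^-4, and the strict `>` comparison is an assumption, since the quoted text does not say how a score exactly equal to the threshold is handled.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Settings quoted in the Experiment Setup row.
NUM_EPOCHS = 40
LEARNING_RATE = 1e-4          # assumed reading of the quoted learning rate
CONFIDENCE_THRESHOLD = 0.1
NUM_DECODER_LAYERS = 6

Joint = Tuple[float, float, float]  # (x, y, z) coordinates of one joint


@dataclass
class PosePrediction:
    """Hypothetical container: one predicted 3D pose and its confidence score."""
    joints: List[Joint]
    confidence: float


def filter_predictions(
    predictions: Sequence[PosePrediction],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> List[PosePrediction]:
    """Keep only predictions whose confidence is strictly above the threshold.

    The strict comparison is an assumption made for this sketch.
    """
    return [p for p in predictions if p.confidence > threshold]


if __name__ == "__main__":
    candidates = [
        PosePrediction(joints=[(0.0, 0.0, 0.0)], confidence=0.95),
        PosePrediction(joints=[(1.0, 1.0, 1.0)], confidence=0.05),
    ]
    kept = filter_predictions(candidates)
    print(f"kept {len(kept)} of {len(candidates)} predictions")
```

With a query-based decoder that always emits a fixed number of candidate poses, a threshold like this is one simple way to discard low-confidence candidates; the released code at https://github.com/sail-sg/mvp remains the reference for the actual behavior.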