Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Towards Generalizable 3D Human Pose Estimation via Ensembles on Flat Loss Landscapes
Authors: Jumin Han, Jun-Hee Kim, Seong-Whan Lee
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that our approach improves the generalization capability of 3D HPE models, and can be easily applied, regardless of model architecture, with consistent performance gains. Our method enhances performances of the model for the representative model architectures (MLP, CNN, GCN, and Transformer) of 3D HPE in benchmark datasets such as Human3.6M [14], MPI-INF-3DHP [24], 3DPW [30], and BEDLAM [2]. |
| Researcher Affiliation | Academia | Jumin Han Department of Artificial Intelligence Korea University, Seoul, South Korea EMAIL Jun-Hee Kim Department of Artificial Intelligence Korea University, Seoul, South Korea EMAIL Seong-Whan Lee Department of Artificial Intelligence Korea University, Seoul, South Korea EMAIL |
| Pseudocode | No | The paper describes methods verbally and with mathematical equations (Eq. 1, 2, 3, 4) and diagrams (Figure 4), but does not present structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Answer: [Yes] Justification: The code is included in the supplementary material. |
| Open Datasets | Yes | Our method enhances performances of the model for the representative model architectures (MLP, CNN, GCN, and Transformer) of 3D HPE in benchmark datasets such as Human3.6M [14], MPI-INF-3DHP [24], 3DPW [30], and BEDLAM [2]. |
| Dataset Splits | Yes | We utilize the data from subjects 1, 5, 6, 7, and 8 as training set, while the data from subjects 9 and 11 are utilized as test set following the literature of 3D HPE. |
| Hardware Specification | Yes | Answer: [Yes] Justification: We explain the GPU we used in the supplementary material. |
| Software Dependencies | No | The main text of the paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x). |
| Experiment Setup | Yes | The results show that applying SAM to a 3D HPE model yielded no performance gain, which confirmed our hypothesis. Note that the models are trained for a longer duration because of the slow convergence of SAM and the perturbation radius is set as 0.05 for SAM training. ... To illustrate this, we compare the training loss trajectories of the model with and without adaptive scaling mechanism during 20 epochs on H36M [14] in Figure 7. |
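
The Human3.6M split quoted in the Dataset Splits row (subjects 1, 5, 6, 7, and 8 for training; subjects 9 and 11 for testing) can be sketched as a simple subject-based partition. This is a minimal illustration, not the authors' code: the helper name `split_by_subject`, the `(subject_id, payload)` sample layout, and the example clip names are all hypothetical.

```python
# Subject IDs are taken from the Dataset Splits row; everything else
# (function name, sample layout, clip names) is a hypothetical illustration.
TRAIN_SUBJECTS = frozenset({1, 5, 6, 7, 8})
TEST_SUBJECTS = frozenset({9, 11})


def split_by_subject(samples):
    """Partition (subject_id, payload) pairs into train and test lists."""
    train, test = [], []
    for subject_id, payload in samples:
        if subject_id in TRAIN_SUBJECTS:
            train.append(payload)
        elif subject_id in TEST_SUBJECTS:
            test.append(payload)
        else:
            # Fail loudly instead of silently dropping unknown subjects.
            raise ValueError(f"unexpected subject id: {subject_id}")
    return train, test


# Toy usage with made-up clip names.
samples = [(1, "s1_walk"), (9, "s9_walk"), (5, "s5_sit"), (11, "s11_sit")]
train, test = split_by_subject(samples)
```

Splitting by subject rather than by frame keeps every test person unseen during training, which matches the paper's focus on generalization.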