Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
$\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human Pose Estimation
Authors: Weiquan Wang, Jun Xiao, Chunping Wang, Wei Liu, Zhao Wang, Long Chen
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluations conducted on various benchmarks (e.g., Human3.6M, 3DPW, and 3DPW-Occ) have demonstrated its effectiveness. |
| Researcher Affiliation | Collaboration | Weiquan Wang¹, Jun Xiao¹, Chunping Wang², Wei Liu³, Zhao Wang¹, Long Chen⁴; ¹Zhejiang University, ²Finvolution Group, ³Tencent, ⁴Hong Kong University of Science and Technology |
| Pseudocode | Yes | In this section, we provide complete training and inference algorithms for discrete diffusion process. Algorithm 1 Training Algorithm for the discrete diffusion process. Algorithm 2 Inference Algorithm for the discrete diffusion process. |
| Open Source Code | No | We will release code upon paper acceptance. |
| Open Datasets | Yes | Human3.6M [34] is the most extensive benchmark for 3D HPE... 3DPW [72] is the first dataset... Additionally, to further verify the occlusion-robustness, we evaluate Di2Pose on the 3DPW-Occ [83], which is a subset of the 3DPW. |
| Dataset Splits | Yes | We follow [22] with same protocol, which involves training on subjects S1, S5, S6, S7, and S8, and testing on subjects S9 and S11. |
| Hardware Specification | Yes | All experiments are carried out on one NVIDIA A100 PCIe GPU. |
| Software Dependencies | No | The proposed Di2Pose is completely implemented in PyTorch [53]. However, no specific version number for PyTorch or other software dependencies is provided. |
| Experiment Setup | Yes | Pose Quantization Step. The pose encoder is constructed with four Local-MLP blocks, while the pose decoder incorporates a single block. Within these Local-MLP blocks, the embedding dimensions D for the pose encoder and decoder are configured to 2048 and 512, respectively. For the quantization process, the projected vector $q_i$ features the channel d = 5. The levels per channel, denoted as $[L_1, \dots, L_d]$, are specified as [7, 5, 5, 5, 5]. The number of quantized tokens N is set to 100. Discrete Diffusion Process. For the occlude and replace transition matrix, we linearly increase $\beta_s$ and $\gamma_s$ from 0 to 0.1 and 0.9, respectively, and decrease $\alpha_s$ from 1 to 0. For the discrete diffusion model, we use off-the-shelf image encoder [79] to extract feature sequence of conditional 2D image. As for the pose denoiser, we build a 21-layer 16-head transformer with the dimension of 1024. We set steps S as 100 and loss weight λ is set to 5e-4. |
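
The numbers quoted in the Experiment Setup row can be collected into a small sketch for anyone attempting a reimplementation. This is not the authors' code (none has been released, per the table above). The helper names `implied_codebook_size` and `linear_schedule` are hypothetical, and two points are assumptions rather than statements from the paper: that the per-channel levels combine into a product codebook, and that the linear schedules include both endpoints over the S steps.

```python
from math import prod

# Values quoted in the Experiment Setup row.
LEVELS = [7, 5, 5, 5, 5]  # levels per channel, d = 5
NUM_TOKENS = 100          # N, number of quantized tokens
NUM_STEPS = 100           # S, diffusion steps
LOSS_WEIGHT = 5e-4        # lambda


def implied_codebook_size(levels):
    """Number of distinct codes, ASSUMING channels are quantized independently."""
    return prod(levels)


def linear_schedule(start, end, steps):
    """Evenly spaced values from start to end; including both endpoints is an assumption."""
    if steps == 1:
        return [float(start)]
    return [start + (end - start) * s / (steps - 1) for s in range(steps)]


# Linear schedules described for the transition matrix.
alphas = linear_schedule(1.0, 0.0, NUM_STEPS)  # decreases from 1 to 0
betas = linear_schedule(0.0, 0.1, NUM_STEPS)   # increases from 0 to 0.1
gammas = linear_schedule(0.0, 0.9, NUM_STEPS)  # increases from 0 to 0.9

if __name__ == "__main__":
    print("implied codebook size:", implied_codebook_size(LEVELS))
    print("first step:", alphas[0], betas[0], gammas[0])
    print("last step:", alphas[-1], betas[-1], gammas[-1])
```

Under this reading, the three schedule values sum to 1 at every step (up to floating-point error), since the endpoints (1, 0, 0) and (0, 0.1, 0.9) both do. Whether the paper's transition matrix imposes exactly this constraint is not stated in the quoted text.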