Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction
Authors: Zhenyu Lou, Qiongjie Cui, Tuo Wang, Zhenbo Song, Luoming Zhang, Cheng Cheng, Haofan Wang, Xu Tang, Huaxia Li, Hong Zhou
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On two real-captured benchmarks, DiMoP3D has demonstrated significant improvements over state-of-the-art methods, showcasing its effectiveness in generating diverse and physically consistent motion predictions within real-world 3D environments. |
| Researcher Affiliation | Collaboration | Zhenyu Lou¹, Qiongjie Cui², Tuo Wang⁴, Zhenbo Song², Luoming Zhang¹, Cheng Cheng⁵, Haofan Wang³, Xu Tang³, Huaxia Li³, Hong Zhou¹. ¹Zhejiang University, ²Nanjing University of Science and Technology, ³Xiaohongshu Inc, ⁴University of Texas at Austin, ⁵Concordia University |
| Pseudocode | Yes | Algorithm 1 get_heightmap(S): |
| Open Source Code | Yes | More details and the video demo are available at the webpage https://sites.google.com/view/dimop3d. Justification: Appendix A tells the training details and the supplemental material provides our source code. |
| Open Datasets | Yes | Dataset-1: GIMO [88], which records motion sequences represented by full-body SMPL-X poses with 129K frames. It consists of 14 scenes with 3D point clouds, each scene is captured by a 3D LiDAR sensor, containing 10-20 objects with 500K vertices. Dataset-2: CIRCLE [4] comprises 10 hours of high-fidelity full-body motion sequences from 5 subjects across nine apartment scenes. |
| Dataset Splits | No | For a fair comparison, we follow the official split to divide the dataset into training and testing sets according to the scenes. |
| Hardware Specification | Yes | All training is conducted on a single NVIDIA RTX3090 GPU, with the complete pipeline converging in 8 hours. |
| Software Dependencies | No | The paper mentions software components like 'ScanNet pretrained SoftGroup model' but does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA in the experimental setup or training details section (Appendix A). |
| Experiment Setup | Yes | We set L = 3-sec and L = 5-sec to achieve a long-term prediction [40, 49]. Hyperparameters λcont = 3, λdist = 10, λobj = 1 are adjusted to maintain a balance among the factors. We set σ1 = 0.3 and σ2 = σ3 = 1.0 for balance. |
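
The Experiment Setup row quotes three loss weights (λcont = 3, λdist = 10, λobj = 1) that are said to balance the training factors. As a minimal sketch, and assuming the three terms are combined as a plain weighted sum (the excerpt does not state the exact form of the objective), the weights could be applied as follows. The function name `combine_losses` and its argument names are hypothetical and do not come from the paper or its released code.

```python
# Loss weights quoted in the Experiment Setup row of the table above.
LAMBDA_CONT = 3.0
LAMBDA_DIST = 10.0
LAMBDA_OBJ = 1.0


def combine_losses(loss_cont: float, loss_dist: float, loss_obj: float) -> float:
    """Hypothetical helper: weighted sum of three already-computed loss terms.

    Assumption: the objective is a linear combination of the three terms.
    The source excerpt only gives the weights, not the formula.
    """
    return (
        LAMBDA_CONT * loss_cont
        + LAMBDA_DIST * loss_dist
        + LAMBDA_OBJ * loss_obj
    )


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    total = combine_losses(loss_cont=0.5, loss_dist=0.1, loss_obj=2.0)
    print(f"total loss: {total:.3f}")
```

With equal raw values, the distance term would dominate the total under these weights, which is one way to read the remark that the weights are tuned to balance the factors.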