Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos
Authors: Cuong Le, John Viktor Johansson, Manon Kok, Bastian Wandt
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on two human motion benchmark datasets. The first and main dataset is the popular Human3.6M [15]. ... We report the quantitative results of OSDCap and other related work on different metrics in Tab. 1. ... We conduct an ablation study to verify the impact of the optimal-state estimation process on simulated motions. |
| Researcher Affiliation | Academia | Cuong Le¹, Viktor Johansson¹, Manon Kok² and Bastian Wandt¹; ¹Department of Electrical Engineering, Linköping University, Sweden; ²Delft Center for Systems and Control, Delft University of Technology, The Netherlands |
| Pseudocode | No | The paper describes the approach using text, mathematical equations, and flow diagrams (Figure 2 and Figure 4), but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | The code is available on . (Abstract); The paper will provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results in the final version. (Question 5, NeurIPS checklist) |
| Open Datasets | Yes | We evaluate our approach on two human motion benchmark datasets. The first and main dataset is the popular Human3.6M [15]. ... The second database is Fit3D [7]... Since the scene setting from Human3.6M and Fit3D are very similar, we perform an additional evaluation on the new dataset Sports Pose [14]. |
| Dataset Splits | Yes | Following previous work [38, 21], the first five subjects (S1, S5, S6, S7, S8) are used for training, and the last two (S9, S11) for evaluation. ... We split the data by taking samples from the 6 actors (s03, s04, s05, s07, s08, s10) for training, and 2 actors (s09, s11) for evaluation... For Sports Pose, we only consider sequences that contain human at time step 0: (S02, S03, S05, S06, S07, S08, S09) for fine-tuning and (S12, S13, S14) for evaluation. |
| Hardware Specification | Yes | The proposed pipeline of OSDCap was trained and evaluated on the NVIDIA-A100 GPU with 40Gb of memory. |
| Software Dependencies | No | The paper mentions software such as RBDL [6], PyBullet [3], and TRACE [40], and common functions like Leaky ReLU and Layernorm. However, it does not specify explicit version numbers for any of these software dependencies. |
| Experiment Setup | Yes | The initial motion observation is generated by TRACE [40]. As suggested by [38, 8], all extracted motions are down-sampled from 100Hz to 50Hz. The samples are aligned to the world origin in the first frame, then split into 100-frame sub-sequences to utilize batch training and evaluation. ... OSDNet is trained for 15 epochs with a base learning rate of 5e-4 and a batch size of 64. The learning rates from all training processes are scheduled to reduce by a factor of 10 at epochs 10 and 13. Leaky ReLU and Layernorm are used as the activation function and normalization for each linear layer of every module. We also apply a training warm-up strategy on the first 5 epochs by increasing the learning rate by factor of 2 to the base learning rate at epoch 5. |
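
The Experiment Setup row can be read as a small preprocessing step plus a learning-rate schedule. The sketch below is one possible interpretation, not the authors' implementation. The function names `downsample_and_split` and `learning_rate` are hypothetical. It assumes that epochs are counted from 0, that the warm-up doubles the rate every epoch until it reaches the base rate at epoch 5, that down-sampling keeps every second frame, and that a trailing window shorter than 100 frames is dropped. The alignment to the world origin is omitted because the excerpt does not say how it is computed.

```python
def downsample_and_split(frames, src_hz=100, dst_hz=50, window=100):
    """Down-sample a frame sequence and cut it into fixed-length windows.

    Assumption: down-sampling is plain decimation (keep every k-th frame),
    and an incomplete final window is discarded.
    """
    if src_hz % dst_hz != 0:
        raise ValueError("src_hz must be an integer multiple of dst_hz")
    step = src_hz // dst_hz              # 100 Hz -> 50 Hz gives step = 2
    kept = frames[::step]
    n_full = len(kept) // window         # number of complete windows
    return [kept[i * window:(i + 1) * window] for i in range(n_full)]


def learning_rate(epoch, base_lr=5e-4, warmup_epochs=5,
                  milestones=(10, 13), decay=0.1):
    """Return the learning rate for a 0-indexed epoch.

    Assumption: during warm-up the rate doubles each epoch and reaches
    base_lr exactly at epoch `warmup_epochs`; afterwards it is multiplied
    by `decay` once for every milestone already reached.
    """
    if epoch < warmup_epochs:
        return base_lr / (2 ** (warmup_epochs - epoch))
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * (decay ** drops)


if __name__ == "__main__":
    # Print the schedule for the 15 training epochs reported in the table.
    for epoch in range(15):
        print(epoch, learning_rate(epoch))

    # 450 frames at 100 Hz become 225 frames at 50 Hz, i.e. two full windows.
    windows = downsample_and_split(list(range(450)))
    print(len(windows), len(windows[0]))
```

Under these assumptions the rate starts at 1/32 of the base value, reaches 5e-4 at epoch 5, and drops by a factor of 10 at epochs 10 and 13. If the warm-up in the paper is linear or 1-indexed, only the first branch of `learning_rate` would need to change.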