Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data

Authors: Seunggeun Chi, Pin-Hao Huang, Enna Sachdeva, Hengbo Ma, Karthik Ramani, Kwonjoon Lee

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of our method was rigorously tested and validated through comprehensive experiments conducted on various HMD setups with the AMASS and Ego-Exo4D datasets. Extensive experiments have proved our model's versatility and accurate pose estimation capabilities in various settings. (Section 4, Experiments)
Researcher Affiliation | Collaboration | Seunggeun Chi (Purdue University, chi65@purdue.edu); Pin-Hao Huang, Enna Sachdeva (Honda Research Institute USA, {pin-hao_huang, enna_sachdeva}@honda-ri.com); Hengbo Ma (Honda Research Institute USA, hengbo.academia@gmail.com); Karthik Ramani (Purdue University, ramani@purdue.edu); Kwonjoon Lee (Honda Research Institute USA, kwonjoon_lee@honda-ri.com)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | Code release is challenging due to our organization's policy.
Open Datasets | Yes | We use the Ego-Exo4D dataset [9] (https://ego-exo4d-data.org), which is licensed under a custom (commercial or non-commercial) license. We also use AMASS [19] (https://amass.is.tue.mpg.de), which is licensed under a custom (non-commercial scientific research) license.
Dataset Splits | Yes | Specifically for the egopose task, it includes separate training and validation video sets containing 334 and 83 videos, respectively.
Hardware Specification | Yes | We ran our experiments on one workstation containing an AMD Ryzen Threadripper PRO 7975WX, 256 GB of DDR5 RAM, and 4 NVIDIA GeForce RTX 4090 GPUs.
Software Dependencies | No | The paper mentions software components such as VQ-VAE, VQ-Diffusion, and Masked Auto-Encoder, and refers to third-party code such as RTMPose and FrankMocap, but it does not provide specific version numbers for these software dependencies as required for reproducibility (e.g., 'PyTorch 1.9' or 'CPLEX 12.4').
Experiment Setup | Yes | VQ-VAE: We adhered to the architectural details and training protocol of Zhang et al. [35], with modifications including setting both the encoder and decoder stride to 1 and adjusting the window size to 40. For the Ego-Exo4D dataset, we employed wing loss with a width of 5 and a curvature of 4. For AMASS, we opted for L2 loss. Additionally, to generate smooth motion, we applied both velocity and acceleration losses, assigning weights of 10 for each in the AMASS dataset and weights of 1 for each in the Ego-Exo4D dataset. ... We use Tw = 40 for both AMASS and Ego-Exo4D. ... Masked Auto-Encoder: ... We trained 4 models to measure the uncertainty, M = 4. ... we set β to 0.5 for training the MAE.
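
The quoted setup specifies the reconstruction losses fairly precisely: wing loss with width 5 and curvature 4 on Ego-Exo4D, L2 loss on AMASS, plus velocity and acceleration penalties weighted 1 (Ego-Exo4D) or 10 (AMASS) over windows of Tw = 40 frames. Since the code is not released, the following PyTorch sketch only illustrates how such a loss could be assembled; the function names, tensor shapes, and joint count are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the pose-reconstruction loss described in the quoted setup.
# This is an illustrative reconstruction, not the authors' released code.
import torch


def wing_loss(pred, target, width=5.0, curvature=4.0):
    """Wing loss (Feng et al., 2018): log-shaped near zero, L1-like for large errors.
    'width' bounds the nonlinear region; 'curvature' controls its steepness."""
    diff = (pred - target).abs()
    c = width - width * torch.log(torch.tensor(1.0 + width / curvature))
    return torch.where(
        diff < width,
        width * torch.log(1.0 + diff / curvature),
        diff - c,
    ).mean()


def smoothness_losses(pred):
    """Velocity and acceleration penalties over a motion window of shape (B, T, J, 3)."""
    vel = pred[:, 1:] - pred[:, :-1]   # first-order temporal difference
    acc = vel[:, 1:] - vel[:, :-1]     # second-order temporal difference
    return vel.pow(2).mean(), acc.pow(2).mean()


def total_loss(pred, target, dataset="ego-exo4d"):
    """Combine the reconstruction term with the smoothness terms, per the quoted weights."""
    if dataset == "ego-exo4d":
        rec = wing_loss(pred, target, width=5.0, curvature=4.0)
        lam_vel = lam_acc = 1.0
    else:  # "amass"
        rec = (pred - target).pow(2).mean()
        lam_vel = lam_acc = 10.0
    vel_l, acc_l = smoothness_losses(pred)
    return rec + lam_vel * vel_l + lam_acc * acc_l


# Example: a batch of windows with Tw = 40 frames and a hypothetical 17 joints in 3D.
pred = torch.randn(2, 40, 17, 3)
target = torch.randn(2, 40, 17, 3)
print(total_loss(pred, target, dataset="ego-exo4d"))
```

The split between a per-frame reconstruction term and temporal-difference penalties mirrors the paper's stated goal of producing smooth motion; only the weights and the choice of wing versus L2 loss change between the two datasets.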