Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data
Authors: Seunggeun Chi, Pin-Hao Huang, Enna Sachdeva, Hengbo Ma, Karthik Ramani, Kwonjoon Lee
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of our method was rigorously tested and validated through comprehensive experiments conducted on various HMD setups with the AMASS and Ego-Exo4D datasets. Extensive experiments have proved our model's versatility and accurate pose estimation capabilities in various settings. (Section 4, Experiments) |
| Researcher Affiliation | Collaboration | Seunggeun Chi (Purdue University); Pin-Hao Huang (Honda Research Institute USA); Enna Sachdeva (Honda Research Institute USA); Hengbo Ma (Honda Research Institute USA); Karthik Ramani (Purdue University); Kwonjoon Lee (Honda Research Institute USA) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Code release is challenging due to our organization's policy. |
| Open Datasets | Yes | We use the Ego-Exo4D dataset [9] https://ego-exo4d-data.org, which is licensed under a custom (commercial or non-commercial) license. We also use AMASS [19] https://amass.is.tue.mpg.de, which is licensed under a custom (non-commercial scientific research) license. |
| Dataset Splits | Yes | Specifically for the egopose task, it includes separate training and validation video sets containing 334 and 83 videos, respectively. |
| Hardware Specification | Yes | We ran our experiments on one workstation containing an AMD Ryzen Threadripper PRO 7975WX, 256 GB of DDR5 RAM, and 4× NVIDIA GeForce RTX 4090 GPUs. |
| Software Dependencies | No | The paper mentions software components like VQ-VAE, VQ-Diffusion, and Masked Auto-Encoder, and refers to third-party code like RTMPose and FrankMocap, but it does not provide specific version numbers for these software dependencies as required for reproducibility (e.g., 'PyTorch 1.9' or 'CPLEX 12.4'). |
| Experiment Setup | Yes | VQ-VAE: We adhered to the architectural details and training protocol of Zhang et al. [35], with modifications including setting both the encoder and decoder stride to 1 and adjusting the window size to 40. For the Ego-Exo4D dataset, we employed wing loss with a width of 5 and a curvature of 4. For AMASS, we opted for L2 loss. Additionally, to generate smooth motion, we applied both velocity and acceleration losses, assigning weights of 10 each for the AMASS dataset and weights of 1 each for the Ego-Exo4D dataset. ... We use Tw = 40 for both AMASS and Ego-Exo4D. ... Masked Auto-Encoder ... We trained 4 models to measure the uncertainty, M = 4. ... we set β to 0.5 for training the MAE. *(Hedged sketches of the quoted loss configuration and of the ensemble uncertainty appear below the table.)* |
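
The experiment-setup quote names a wing loss (width 5, curvature 4), an L2 alternative for AMASS, and velocity/acceleration smoothness terms with dataset-specific weights. Since no code is released, the following is a minimal PyTorch sketch of how such a loss could be assembled; the function names, the `(batch, time, dim)` tensor layout, and the composition in `total_loss` are assumptions, not the authors' implementation.

```python
import math
import torch

def wing_loss(pred, target, width=5.0, curvature=4.0):
    """Wing loss (Feng et al., 2018): logarithmic near zero, L1-like for
    large errors. Defaults match the Ego-Exo4D values quoted above."""
    diff = (pred - target).abs()
    # Offset that makes the two pieces join continuously at |diff| == width.
    c = width - width * math.log(1.0 + width / curvature)
    return torch.where(diff < width,
                       width * torch.log(1.0 + diff / curvature),
                       diff - c).mean()

def smoothness_losses(motion):
    """First- and second-difference penalties on a (batch, time, dim) pose
    sequence; this tensor layout is an assumption."""
    vel = motion[:, 1:] - motion[:, :-1]   # per-frame velocity
    acc = vel[:, 1:] - vel[:, :-1]         # per-frame acceleration
    return vel.pow(2).mean(), acc.pow(2).mean()

def total_loss(pred, target, dataset="egoexo4d"):
    # Hypothetical composition following the quoted weights:
    #   Ego-Exo4D: wing reconstruction + 1 * velocity + 1 * acceleration
    #   AMASS:     L2 reconstruction + 10 * velocity + 10 * acceleration
    vel, acc = smoothness_losses(pred)
    if dataset == "egoexo4d":
        return wing_loss(pred, target) + 1.0 * vel + 1.0 * acc
    return (pred - target).pow(2).mean() + 10.0 * vel + 10.0 * acc
```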
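
The setup also states that 4 masked auto-encoders were trained to measure uncertainty (M = 4). A common reading is deep-ensemble uncertainty, i.e. the variance of predictions across the independently trained models; the sketch below assumes that interpretation and is not confirmed by the paper.

```python
import torch

def ensemble_uncertainty(models, inputs):
    """Predictive mean and per-element variance across an ensemble of
    M independently trained models (M = 4 in the quoted setup)."""
    with torch.no_grad():
        preds = torch.stack([m(inputs) for m in models])  # (M, ...)
    return preds.mean(dim=0), preds.var(dim=0)

# Usage (hypothetical names): `maes` is a list of 4 trained MAE models.
# mean_pose, pose_var = ensemble_uncertainty(maes, masked_tokens)
```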