Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation
Authors: Bohan Zhou, Yi Zhan, Zhongbin Zhang, Zongqing Lu
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across five in-domain and two cross-domain datasets demonstrate the effectiveness of MEgoHand, achieving substantial reductions in wrist translation error (86.9%) and joint rotation error (34.1%), highlighting its capacity to accurately model fine-grained hand joint structures and generalize robustly across diverse scenarios. |
| Researcher Affiliation | Collaboration | 1 School of Computer Science, Peking University 2 Department of Automation, Tsinghua University 3 Being Beyond |
| Pseudocode | No | The paper describes its methodology through prose and mathematical formulations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | All codes and documents are included in the supplementary materials. |
| Open Datasets | Yes | We utilize a variety of publicly available egocentric hand-object interaction datasets in our experiments. Below is a brief description of each dataset along with its official website for reference: H2O: https://taeinkwon.com/projects/h2o/ HOI4D: https://hoi4d.github.io/ HOT3D: https://facebookresearch.github.io/hot3d/ OAKINK2: https://oakink.net/v2/ TACO: https://taco2024.github.io/ ARCTIC: https://arctic.is.tue.mpg.de/ HOLO: https://holoassist.github.io/#HoloAssist |
| Dataset Splits | Yes | We include 6 training datasets: TACO, FPHA, HOI4D, H2O, HOT3D, and OakInk2... For the other five datasets, we hold out 10% of the data from each as in-domain evaluation sets... we evaluate on two cross-domain test sets: full ARCTIC dataset and a 10% partition of the HOLO dataset. |
| Hardware Specification | Yes | MEgoHand is trained using 8 80GB NVIDIA A800 GPUs over approximately 24 hours. All evaluations and visualizations are performed on a single 80GB A800 GPU for around three hours. We evaluated the end-to-end inference performance for generating a 16-frame sequence on a single RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions using 'Optimizer AdamW' and various pre-trained models like 'Eagle-2', 'SmolLM2', 'SigLIP-2', and 'UniDepthV2', but does not provide specific version numbers for software libraries or frameworks (e.g., Python, PyTorch, CUDA) required for replication. |
| Experiment Setup | Yes | Appendix A.1 provides a table titled 'Hyperparameters of MEgoHand Training' which lists specific values for Prediction Trunk Size l (16), Integration Step Size δ (0.1), Gradient steps (50,000), Batch size (64), Learning Rate (3e-4), Optimizer (AdamW), Adam β1 (0.95), Adam β2 (0.999), Adam ϵ (1e-8), LR scheduler (cosine), Weight Decay (1e-5), Warmup Ratio (0.05), and mentions the frozen/unfrozen status of VLM components and DiT. |
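
The Experiment Setup row lists a cosine LR scheduler, a learning rate of 3e-4, 50,000 gradient steps, and a warmup ratio of 0.05. The row does not give the exact schedule shape. The sketch below is one plausible reading, not the authors' implementation: it assumes the warmup ratio is a fraction of the total gradient steps, that warmup is linear, and that the cosine decay ends at zero. The function name `lr_at` is hypothetical.

```python
import math

# Values taken from the Experiment Setup row above.
TOTAL_STEPS = 50_000
BASE_LR = 3e-4
WARMUP_RATIO = 0.05


def lr_at(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given gradient step.

    Assumptions (not stated in the source): linear warmup over
    warmup_ratio * total_steps steps, then cosine decay to zero.
    """
    warmup_steps = round(total_steps * warmup_ratio)
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Half-cosine from base_lr (progress = 0) down to 0 (progress = 1).
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 2_499, 2_500, 26_250, 50_000):
        print(s, lr_at(s))
```

Under these assumptions the warmup lasts 2,500 steps, the rate peaks at 3e-4, and it passes through half of the peak at step 26,250, the midpoint of the decay phase.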