Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba

Authors: Haoye Dong, Aviral Chharia, Wenbo Gou, Francisco Vicente Carrasco, Fernando D De la Torre

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on several benchmarks and in-the-wild tests demonstrate that Hamba significantly outperforms existing SOTAs, achieving the PA-MPVPE of 5.3mm and F@15mm of 0.992 on FreiHAND."
Researcher Affiliation | Academia | "Haoye Dong, Aviral Chharia, Wenbo Gou, Francisco Vicente Carrasco, Fernando De la Torre, Carnegie Mellon University, {haoyed, achharia, wgou, fvicente, ftorre}@andrew.cmu.edu"
Pseudocode | Yes | "Algorithm 1: Graph-guided State Space (GSS) block" (an illustrative sketch of such a block follows the table)
Open Source Code | Yes | "Our code was included in the Supplementary .zip file during the NeurIPS review. We will open-source it shortly with a detailed readme on the project's GitHub repository."
Open Datasets | Yes | "We train Hamba on 2.7M training samples from multiple datasets (same setting as [70] for a fair comparison) that had either both 2D and 3D hand annotations or just 2D annotations. This included FreiHAND [111], HO3D [29], MTC [91], RHD [110], InterHand2.6M [64], H2O3D [29], DexYCB [6], COCO-WholeBody [36], Halpe [21], and MPII NZSL [79] datasets." (see the dataset-pooling sketch below)
Dataset Splits | No | The paper mentions "Early stopping was used after 170k steps to prevent overfitting", implying the use of a validation set, but it does not specify the explicit split (e.g., percentages or counts) or how the validation data was partitioned from the training samples.
Hardware Specification | Yes | "The Joints Regressor (JR) was trained on a single NVIDIA A4500 GPU... The complete Hamba model was trained on two NVIDIA A6000 GPUs... Hamba (Ours): 1 A100, 300K Steps"
Software Dependencies | No | The paper mentions the torch.nn.functional.grid_sample module of PyTorch but does not specify the version number for PyTorch or other software dependencies. (a minimal grid_sample usage sketch follows the table)
Experiment Setup | Yes | "We set the learning rate as 10^-5 and the weight decay factor as 10^-4, with the sum loss. Weights for each term in the loss function are λ_3D = 0.05 for the 3D keypoint loss, λ_2D = 0.01 for the 2D keypoint loss, and λ_θ = 0.001 for the global orientation and hand pose loss. Weights for the beta and adversarial losses, i.e., λ_β and λ_adv, were set as 0.0005." (expressed as code below)
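
The paper's Algorithm 1 describes the Graph-guided State Space (GSS) block. As a rough illustration of what a graph-guided bidirectional state-space block could look like, here is a minimal PyTorch sketch. Everything in it — the module layout, the use of the mamba-ssm package, the graph-mixing step, and the residual structure — is an assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumed dependency (pip install mamba-ssm); needs a CUDA build

class GSSBlock(nn.Module):
    """Hypothetical Graph-guided State Space block: a graph-convolution-style
    mixing step over hand-joint tokens, followed by a bidirectional Mamba scan.
    Structure is guessed from the paper's description, not its released code."""
    def __init__(self, dim, adj):
        super().__init__()
        # Row-normalized adjacency over the joint graph (include self-loops
        # so every row sum is nonzero).
        self.register_buffer("adj", adj / adj.sum(-1, keepdim=True))
        self.graph_proj = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)
        self.ssm_fwd = Mamba(d_model=dim)   # forward scan over the token order
        self.ssm_bwd = Mamba(d_model=dim)   # backward scan (on flipped tokens)
        self.out = nn.Linear(2 * dim, dim)

    def forward(self, x):                   # x: (B, N_joints, dim)
        # Graph guidance: mix each joint token with its graph neighbors.
        x = x + self.graph_proj(self.adj @ x)
        h = self.norm(x)
        # Bi-directional scan: forward order plus reversed order, then fuse.
        fwd = self.ssm_fwd(h)
        bwd = self.ssm_bwd(h.flip(1)).flip(1)
        return x + self.out(torch.cat([fwd, bwd], dim=-1))

# Usage (21 hand joints; identity adjacency as a trivial stand-in):
block = GSSBlock(dim=256, adj=torch.eye(21)).cuda()
tokens = block(torch.randn(2, 21, 256, device="cuda"))
```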
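
The Open Datasets row lists ten sources pooled into 2.7M training samples. One plausible way to wire up such pooling in PyTorch is `ConcatDataset`; in this sketch, dummy tensor datasets stand in for the real FreiHAND, HO3D, etc. loaders, which would each need to emit samples in a shared format after preprocessing.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Dummy stand-ins for the real datasets; each yields (image, 3D joints).
sources = [TensorDataset(torch.randn(100, 3, 224, 224), torch.randn(100, 21, 3))
           for _ in range(10)]
train_pool = ConcatDataset(sources)          # 1,000 dummy samples in total
loader = DataLoader(train_pool, batch_size=64, shuffle=True)
```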
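
The Software Dependencies row points at PyTorch's `torch.nn.functional.grid_sample`. As a hedged illustration of how a hand-reconstruction model might use it, the self-contained snippet below bilinearly samples backbone features at 2D joint locations; all shapes, the feature-map size, and the sampling role are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

# Feature map from a backbone: (B, C, H, W); 2D joints in pixel coordinates.
feats = torch.randn(1, 256, 32, 32)
joints_2d = torch.rand(1, 21, 2) * 32            # 21 joints, illustrative

# grid_sample expects coordinates normalized to [-1, 1].
grid = (joints_2d / 32) * 2 - 1                  # (B, 21, 2)
grid = grid.view(1, 21, 1, 2)                    # (B, H_out=21, W_out=1, 2)

# Bilinearly sample one C-dim feature vector per joint: (B, C, 21, 1).
joint_feats = F.grid_sample(feats, grid, mode="bilinear", align_corners=False)
joint_feats = joint_feats.squeeze(-1).transpose(1, 2)   # (B, 21, C)
```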
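
The Experiment Setup row's weighted sum loss, written out as code with the coefficients quoted above; the individual loss terms are placeholders, not the paper's actual loss functions.

```python
import torch

# Loss weights as reported in the paper.
LAMBDA_3D, LAMBDA_2D = 0.05, 0.01
LAMBDA_THETA, LAMBDA_BETA, LAMBDA_ADV = 0.001, 0.0005, 0.0005

def total_loss(l3d, l2d, l_theta, l_beta, l_adv):
    """Weighted sum of the 3D keypoint, 2D keypoint, pose/orientation,
    beta, and adversarial loss terms."""
    return (LAMBDA_3D * l3d + LAMBDA_2D * l2d + LAMBDA_THETA * l_theta
            + LAMBDA_BETA * l_beta + LAMBDA_ADV * l_adv)

# Example with dummy scalar losses:
loss = total_loss(*[torch.tensor(1.0) for _ in range(5)])
```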