SiMA-Hand: Boosting 3D Hand-Mesh Reconstruction by Single-to-Multi-View Adaptation

Authors: Yinqiao Wang, Hao Xu, Pheng-Ann Heng, Chi-Wing Fu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on the Dex-YCB and HanCo benchmarks with challenging object- and self-caused occlusion cases, manifesting that SiMA-Hand consistently achieves superior performance over the state of the arts.
Researcher Affiliation | Academia | Yinqiao Wang 1,2; Hao Xu 1,2; Pheng-Ann Heng 1,2; Chi-Wing Fu 1,2. 1 Department of Computer Science and Engineering, CUHK; 2 Institute of Medical Intelligence and XR, CUHK
Pseudocode | No | The paper describes its methods in detail with mathematical formulations and diagrams but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | Code will be released on https://github.com/JoyboyWang/SiMA-Hand_Pytorch.
Open Datasets | Yes | We conduct experiments on Dex-YCB (Chao et al. 2021) and HanCo (Zimmermann, Argus, and Brox 2021).
Dataset Splits | No | For Dex-YCB, the paper states "We adopt the default S0 train/test split with 406,888/78,768 samples for training/testing." For HanCo, it describes training and testing sets, but a specific validation split is not explicitly mentioned.
Hardware Specification | Yes | We train SiMA-Hand on four NVidia Titan V GPUs, and the Adam optimizer (Kingma and Ba 2014) is adopted. The batch size in training is set to 64 for MVR-Hand and 128/32 for SVR-Hand. The FPS is tested on an NVidia RTX 2080Ti.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions).
Experiment Setup | Yes | We follow (Chen et al. 2022) to pre-train the feature encoder network and adopt a two-stage strategy as (Xu et al. 2023) to stabilize the training. We train SiMA-Hand on four NVidia Titan V GPUs, and the Adam optimizer (Kingma and Ba 2014) is adopted. The batch size in training is set to 64 for MVR-Hand and 128/32 for SVR-Hand. The input image is resized to 128 × 128 and augmented by random scaling, rotating, and color jittering. All N = 8 views are used for training the MVR-Hand.
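To make the reported Dex-YCB split and batch sizes concrete, here is a minimal stdlib-only Python sketch that derives optimizer steps per epoch from those figures. The constant and function names are illustrative, not taken from the paper's code; only the numbers (406,888 training samples, batch sizes 64 and 128) come from the quoted setup.

```python
import math

# Figures quoted from the paper (Dex-YCB default S0 split).
DEXYCB_TRAIN_SAMPLES = 406_888
DEXYCB_TEST_SAMPLES = 78_768
BATCH_MVR = 64    # reported batch size for MVR-Hand
BATCH_SVR = 128   # reported for SVR-Hand (32 is also listed for one stage)

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps needed to visit every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

if __name__ == "__main__":
    print(steps_per_epoch(DEXYCB_TRAIN_SAMPLES, BATCH_MVR))  # 6358
    print(steps_per_epoch(DEXYCB_TRAIN_SAMPLES, BATCH_SVR))  # 3179
```

This kind of back-of-the-envelope check is useful when reproducing the setup, e.g. to verify that a logged step count per epoch matches the stated split and batch size.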