DDF-HO: Hand-Held Object Reconstruction via Conditional Directed Distance Field

Authors: Chenyangguang Zhang, Yan Di, Ruida Zhang, Guangyao Zhai, Fabian Manhardt, Federico Tombari, Xiangyang Ji

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on synthetic and real-world datasets demonstrate that DDF-HO consistently outperforms all baseline methods by a large margin, especially under Chamfer Distance, with about 80% leap forward. Codes are available at https://github.com/ZhangCYG/DDFHO.
Researcher Affiliation | Collaboration | Chenyangguang Zhang¹, Yan Di², Ruida Zhang¹, Guangyao Zhai², Fabian Manhardt³, Federico Tombari²,³, Xiangyang Ji¹; ¹Tsinghua University, ²Technical University of Munich, ³Google
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are available at https://github.com/ZhangCYG/DDFHO.
Open Datasets | Yes | A synthetic dataset ObMan [29] and two real-world datasets HO3D(v2) [28], MOW [6] are utilized to evaluate DDF-HO in various scenarios. ObMan consists of 2772 objects of 8 categories from ShapeNet [8], with 21K grasps generated by GraspIt [43]. HO3D(v2) [28] contains 77,558 images from 68 sequences with 10 different persons manipulating 10 different YCB objects [5]. MOW [6] comprises a total of 442 images and 121 object templates, collected from in-the-wild hand-object interaction datasets [14, 56].
Dataset Splits | Yes | We follow [29, 67] to split the training and testing sets. HO3D(v2) [28]... We follow [28] to split training and testing sets. MOW [6]... The training and testing splits remain the same as the released code of [67].
Hardware Specification | Yes | We conduct the training, evaluation and visualization of DDF-HO on a single A100 40GB GPU.
Software Dependencies | No | The paper mentions 'Trimesh 2' but does not provide specific version numbers for other key software components, libraries, or programming languages used (e.g., Python, PyTorch, CUDA) to reproduce the experiment.
Experiment Setup | Yes | The number of sampled points K_l along the projected 2D ray is set to 8 and number of multi-head attention is 2 for 2D Ray-Based Feature Aggregation technique. K_3D for F^L_3D introduced in Sec. 3.4 is set as 8. DDF-HO is trained end-to-end using Adam with a learning rate of 1e-4 on ObMan for 100 epochs. Following [67], we use the network weights learned on synthetic ObMan to initialize the training on HO3D(v2) and MOW. Training on HO3D(v2) and MOW also use Adam optimizer with a learning rate 1e-5 for another 100 and 10 epochs, respectively following [67]. The weighting factors of the loss for DDF-HO λ1, λ2 are set to 5.0 and 0.5, respectively.
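
The values quoted in the Experiment Setup row can be collected into a small configuration object, which may help when checking a reproduction against the reported settings. The sketch below is illustrative only: the class names, field names, and the helper total_epochs are hypothetical and are not taken from the DDF-HO repository. Only the numeric values, dataset names, and the choice of Adam come from the quoted text. The form of the loss that λ1 and λ2 weight is not given in the quote, so the sketch stores the two factors without combining them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrainStage:
    """One training stage as described in the quoted setup (names are hypothetical)."""
    dataset: str
    optimizer: str
    learning_rate: float
    epochs: int
    init_from: Optional[str] = None  # dataset whose learned weights initialize this stage


@dataclass(frozen=True)
class Hyperparams:
    """Hyperparameters quoted in the setup row (field names are hypothetical)."""
    k_l: int = 8              # sampled points along the projected 2D ray
    attention_count: int = 2  # "number of multi-head attention"; heads vs. layers is not stated
    k_3d: int = 8             # K_3D, introduced in Sec. 3.4 of the paper
    lambda_1: float = 5.0     # first loss weighting factor
    lambda_2: float = 0.5     # second loss weighting factor


HYPERPARAMS = Hyperparams()

# Pre-training on synthetic ObMan, then fine-tuning on each real-world dataset.
SCHEDULE: Tuple[TrainStage, ...] = (
    TrainStage("ObMan", "Adam", 1e-4, 100),
    TrainStage("HO3D(v2)", "Adam", 1e-5, 100, init_from="ObMan"),
    TrainStage("MOW", "Adam", 1e-5, 10, init_from="ObMan"),
)


def total_epochs(schedule: Tuple[TrainStage, ...]) -> int:
    """Sum the epochs over all stages in the schedule."""
    return sum(stage.epochs for stage in schedule)


if __name__ == "__main__":
    for stage in SCHEDULE:
        source = stage.init_from or "scratch"
        print(f"{stage.dataset}: {stage.optimizer}, lr={stage.learning_rate}, "
              f"epochs={stage.epochs}, init={source}")
```

Keeping the two fine-tuning stages as separate entries mirrors the quoted text, in which HO3D(v2) and MOW are each initialized from the ObMan weights rather than trained one after the other.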