Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction

Authors: Tong He, John Collomosse, Hailin Jin, Stefano Soatto

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate Geo-PIFu on a recent human mesh public dataset that is 10× larger than the private commercial dataset used in PIFu and previous derivative work. On average, we exceed the state of the art by 42.7% reduction in Chamfer and Point-to-Surface Distances, and 19.4% reduction in normal estimation errors. (The two distance metrics are sketched below the table.) |
| Researcher Affiliation | Collaboration | UCLA; Creative Intelligence Lab, Adobe Research; CVSSP, University of Surrey, UK. |
| Pseudocode | No | The paper describes the model architecture and training process in text and figures but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states that 'More qualitative results and network architecture details can be found in supplementary' but does not state that the source code is released, nor provide a link to it. |
| Open Datasets | Yes | We use the DeepHuman dataset [38]. It contains 5436 training and 1359 test human meshes of various clothes and poses... More importantly, our dataset is public and the meshes are reconstructed from cheap RGB-D sensors... |
| Dataset Splits | No | We use the DeepHuman dataset [38]. It contains 5436 training and 1359 test human meshes of various clothes and poses. |
| Hardware Specification | Yes | We stop the 3D decoder training after 30 epochs and the implicit function training after 45 epochs using AWS V100-6GPU. |
| Software Dependencies | No | We use PyTorch [24] with RMSprop optimizer [32] and learning rate 1e-3 for both losses. |
| Experiment Setup | Yes | We use PyTorch [24] with RMSprop optimizer [32] and learning rate 1e-3 for both losses. We stop the 3D decoder training after 30 epochs and the implicit function training after 45 epochs using AWS V100-6GPU. The learning rate is reduced by a factor of 10 at the 8th, 23rd, and 40th epochs. We use a batch of 30 and 36 single-view images, respectively. For each single-view image, we sample Q = 5000 query points to compute L_query. The step length d for multi-scale tri-linear interpolation is 0.0722 and we stop collecting features at 2d length. The balancing weight γ of L_geo is 0.7. (Hedged sketches of this schedule and of the multi-scale feature sampling follow the table.) |
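
For reference, the headline numbers in the Research Type row are Chamfer and point-to-surface distances between predicted and ground-truth geometry. Below is a minimal sketch of both metrics on sampled point clouds; the function names and the symmetric-Chamfer averaging convention are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between point clouds of shape (N, 3) and (M, 3).

    Averages nearest-neighbor distances in both directions. Papers differ on
    squaring and averaging conventions; this is one common variant, not
    necessarily the exact one used in the Geo-PIFu evaluation.
    """
    d_ab, _ = cKDTree(points_b).query(points_a)  # each a-point to its nearest b-point
    d_ba, _ = cKDTree(points_a).query(points_b)  # each b-point to its nearest a-point
    return d_ab.mean() + d_ba.mean()

def point_to_surface_distance(query_points, surface_points):
    """One-directional mean distance from predicted points to a densely
    sampled ground-truth surface."""
    d, _ = cKDTree(surface_points).query(query_points)
    return d.mean()

# Example with random stand-in clouds:
a = np.random.rand(1000, 3)
b = np.random.rand(1200, 3)
print(chamfer_distance(a, b), point_to_surface_distance(a, b))
```

Both metrics are computed on points sampled from the predicted and ground-truth meshes; the paper's exact sampling density and normalization are not restated here.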
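The Experiment Setup row translates directly into a two-stage training schedule. Here is a minimal PyTorch sketch assuming standard `torch.optim` components; `decoder` and `implicit_fn` are hypothetical stand-ins for the paper's actual networks, whose architectures are described in its supplementary material.

```python
import torch

# Hypothetical stand-in modules for the paper's 3D decoder and implicit function.
decoder = torch.nn.Linear(256, 128)
implicit_fn = torch.nn.Linear(128, 1)

GAMMA_GEO = 0.7  # balancing weight gamma for L_geo, as stated in the paper

def make_optimizer(model):
    # RMSprop with learning rate 1e-3, as stated for both losses.
    opt = torch.optim.RMSprop(model.parameters(), lr=1e-3)
    # Learning rate reduced by a factor of 10 at the 8th, 23rd, and 40th epochs.
    sched = torch.optim.lr_scheduler.MultiStepLR(
        opt, milestones=[8, 23, 40], gamma=0.1)
    return opt, sched

# Stage 1: 3D decoder for 30 epochs (batch of 30 single-view images);
# stage 2: implicit function for 45 epochs (batch of 36).
for model, num_epochs in [(decoder, 30), (implicit_fn, 45)]:
    opt, sched = make_optimizer(model)
    for epoch in range(num_epochs):
        # ... iterate over training batches, computing the loss and calling
        # opt.zero_grad(), loss.backward(), opt.step() per batch ...
        sched.step()  # apply the epoch-level LR decay
```

Whether the milestone schedule applies to both stages or only the longer implicit-function stage is not stated; the sketch applies it to both.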
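The phrase "multi-scale tri-linear interpolation with step length d" suggests sampling the latent voxel features at each query point and at small offsets around it, stopping at distance 2d. The sketch below shows one plausible reading using `torch.nn.functional.grid_sample`; the axis-aligned offset pattern is an assumption, since the main text does not spell out the exact sampling scheme.

```python
import torch
import torch.nn.functional as F

def sample_multiscale_features(feat_volume, query_points, d=0.0722):
    """Tri-linearly sample a voxel feature grid at each query point and at
    axis-aligned offsets of +/-d and +/-2d, then concatenate per point.

    feat_volume:  (1, C, D, H, W) latent voxel features.
    query_points: (1, Q, 3) coordinates normalized to [-1, 1].
    Returns:      (1, 13 * C, Q) concatenated multi-scale features.
    """
    offsets = [torch.zeros(3)]          # the query point itself
    for scale in (d, 2 * d):            # stop collecting features at 2d
        for axis in range(3):
            for sign in (1.0, -1.0):
                o = torch.zeros(3)
                o[axis] = sign * scale
                offsets.append(o)

    feats = []
    for o in offsets:
        grid = (query_points + o).view(1, -1, 1, 1, 3)   # (1, Q, 1, 1, 3)
        # For 5D inputs, mode="bilinear" performs tri-linear interpolation.
        f = F.grid_sample(feat_volume, grid, mode="bilinear",
                          align_corners=True)            # (1, C, Q, 1, 1)
        feats.append(f.view(1, feat_volume.shape[1], -1))
    return torch.cat(feats, dim=1)
```

With Q = 5000 query points per image, as in the setup above, `query_points` would have shape (1, 5000, 3).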