Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction
Authors: Tong He, John Collomosse, Hailin Jin, Stefano Soatto
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Geo-PIFu on a recent human mesh public dataset that is 10× larger than the private commercial dataset used in PIFu and previous derivative work. On average, we exceed the state of the art by 42.7% reduction in Chamfer and Point-to-Surface Distances, and 19.4% reduction in normal estimation errors. |
| Researcher Affiliation | Collaboration | 1UCLA. 2Creative Intelligence Lab, Adobe Research. 3CVSSP, University of Surrey, UK. |
| Pseudocode | No | The paper describes the model architecture and training process in text and figures but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states that 'More qualitative results and network architecture details can be found in supplementary' but does not explicitly state that the source code for the methodology is released or provide a link. |
| Open Datasets | Yes | Dataset We use the DeepHuman dataset [38]. It contains 5436 training and 1359 test human meshes of various clothes and poses... More importantly, our dataset is public and the meshes are reconstructed from cheap RGB-D sensors... |
| Dataset Splits | No | Dataset We use the DeepHuman dataset [38]. It contains 5436 training and 1359 test human meshes of various clothes and poses. The paper reports only train/test counts; no validation split or split-generation procedure is specified. |
| Hardware Specification | Yes | We stop the 3D decoder training after 30 epochs and the implicit function training after 45 epochs using AWS V100-6GPU. |
| Software Dependencies | No | We use PyTorch [24] with RMSprop optimizer [32] and learning rate 1e-3 for both losses. |
| Experiment Setup | Yes | We use PyTorch [24] with RMSprop optimizer [32] and learning rate 1e-3 for both losses. We stop the 3D decoder training after 30 epochs and the implicit function training after 45 epochs using AWS V100-6GPU. The learning rate is reduced by a factor of 10 at the 8th, 23rd and 40th epochs. We use a batch of 30 and 36 single-view images, respectively. For each single-view image, we sample Q = 5000 query points to compute Lquery. The step length d for multi-scale tri-linear interpolation is 0.0722 and we stop collecting features at 2d length. The balancing weight γ of Lgeo is 0.7. |
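The optimizer and learning-rate schedule quoted above can be sketched in PyTorch. This is a minimal illustration, not the authors' code (which is not described in the paper): the model is a placeholder stand-in for the implicit-function network, and the training loop body is elided.

```python
# Sketch of the reported training schedule: RMSprop, lr 1e-3,
# decayed by a factor of 10 at epochs 8, 23, and 40, for 45 epochs.
import torch

model = torch.nn.Linear(8, 1)  # placeholder for the implicit-function network

optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[8, 23, 40], gamma=0.1)

gamma_geo = 0.7   # balancing weight of L_geo, as reported
num_query = 5000  # Q query points sampled per single-view image

for epoch in range(45):  # implicit-function training stops after 45 epochs
    # ... per-batch forward pass, L_query + gamma_geo * L_geo, backward ...
    optimizer.step()     # placeholder step; the real loop iterates over batches
    scheduler.step()     # applies the milestone decay once per epoch
```

After all three milestones the learning rate is 1e-3 × 0.1³ = 1e-6, matching the quoted "reduced by a factor of 10" schedule.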