Multiview Human Body Reconstruction from Uncalibrated Cameras

Authors: Zhixuan Yu, Linguang Zhang, Yuanlu Xu, Chengcheng Tang, Luan Tran, Cem Keskin, Hyun Soo Park

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental We validate our calibration-free multiview fusion approach on multiple datasets varying from indoor to outdoor, controlled to in-the-wild environments. We refer the reader to check additional results, experiments and implementation details in the supplementary material.
Researcher Affiliation Collaboration Zhixuan Yu, University of Minnesota, yu000064@umn.edu; Linguang Zhang, Meta Reality Labs, linguang@meta.com; Yuanlu Xu, Meta Reality Labs Research, yuanluxu@meta.com; Chengcheng Tang, Meta Reality Labs, chengcheng.tang@meta.com; Luan Tran, Meta Reality Labs, tranluan07@meta.com; Cem Keskin, Meta Reality Labs, cemkeskin@meta.com; Hyun Soo Park, University of Minnesota, hspark@umn.edu
Pseudocode No The paper describes the method using figures and equations but does not provide pseudocode or algorithm blocks.
Open Source Code No The paper does not contain an explicit statement about releasing source code or a link to a code repository.
Open Datasets Yes Human3.6M [15] is a large-scale multiview dataset with ground truth 3D human pose annotation... UP-3D [26] is an in-the-wild single view dataset... MARCOnI [6] is a multiview dataset... VBR [2] is a multiview dataset...
Dataset Splits No We follow the standard training/testing split: using subject S1, S5, S6, S7 and S8 for training, and subject S9 and S11 for testing. We use the standard training split [24, 52]. The paper does not explicitly mention a 'validation' split or provide specific percentages/counts for the splits beyond subject IDs.
Hardware Specification No The paper does not provide specific hardware details such as GPU or CPU models used for running experiments.
Software Dependencies No The paper mentions using a 'ResNet-50 backbone' and a 'pre-trained DensePose model' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup Yes We design the encoder f_E(·; θ_E) as a ResNet-50 backbone [12], that takes a 224 × 224 × 3 image as an input and outputs a global feature vector with 256 dimensions and a 56 × 56 local feature map with 256 dimensions... We set τ = 0.05 and σ = 2.33 × 10^-2.
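
The encoder dimensions quoted in the Experiment Setup row can be sanity-checked with simple shape arithmetic. The sketch below assumes the standard ResNet-50 layer hyperparameters (kernel sizes, strides, channel widths) and tracks only tensor shapes. Under that assumption, a 56 × 56 map with 256 channels matches the output of the first residual stage at 224 × 224 input. Which stage the authors actually tap, and how the 256-dimensional global vector is derived, is not stated in this summary, so treat the mapping as a hypothesis rather than the paper's implementation. The function names `conv_out` and `resnet50_stage_shapes` are illustrative.

```python
# Shape bookkeeping for a standard ResNet-50 on a square RGB input.
# Assumption: layer hyperparameters are those of the standard ResNet-50;
# the summary above does not say which stage provides the local feature map.

def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet50_stage_shapes(input_size: int = 224):
    """Return a list of (stage name, height/width, channels) tuples."""
    shapes = []

    # Stem: 7x7 convolution with stride 2, then 3x3 max pooling with stride 2.
    size = conv_out(input_size, kernel=7, stride=2, padding=3)
    shapes.append(("conv1", size, 64))
    size = conv_out(size, kernel=3, stride=2, padding=1)
    shapes.append(("maxpool", size, 64))

    # Residual stages: (name, stride of the first block, output channels).
    # Downsampling is modeled as one 3x3 convolution with padding 1.
    stages = [
        ("layer1", 1, 256),
        ("layer2", 2, 512),
        ("layer3", 2, 1024),
        ("layer4", 2, 2048),
    ]
    for name, stride, channels in stages:
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        shapes.append((name, size, channels))
    return shapes


if __name__ == "__main__":
    for name, size, channels in resnet50_stage_shapes():
        print(f"{name}: {size} x {size} x {channels}")
```

Running the script lists the spatial size and channel count after each stage, which makes it easy to see where a 56 × 56 × 256 tensor would arise in an unmodified backbone.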