Dense Keypoints via Multiview Supervision
Authors: Zhixuan Yu, Haozheng Yu, Long Sha, Sujoy Ganguly, Hyun Soo Park
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on human and monkey targets as two example applications to evaluate the effectiveness of our proposed semi-supervised learning pipeline. |
| Researcher Affiliation | Collaboration | Zhixuan Yu (University of Minnesota, yu000064@umn.edu); Haozheng Yu (University of Minnesota, yu000424@umn.edu); Long Sha (TuSimple, long.sha@tusimple.ai); Sujoy Ganguly (Unity, sujoy.ganguly@unity3d.com); Hyun Soo Park (University of Minnesota, hspark@umn.edu) |
| Pseudocode | No | The paper describes its methods but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | Human3.6M [11] is a large-scale indoor multiview dataset... For the human dense keypoints, we use 48K human instances in the DensePose-COCO [8] training set to train the initial model... 3DPW [47] is an in-the-wild dataset... Ski-Pose PTZ-Camera Dataset [40] is a multiview dataset... OpenMonkeyPose [3] is a large landmark dataset... |
| Dataset Splits | Yes | "Human3.6M [11]... Following common protocols, we use subject S1, S5, S6, S7 and S8 for training, and reserve subject S9 and S11 for testing." and "Ski-Pose PTZ-Camera Dataset [40]... It contains 8.5K training images and 1.7K testing images. We use its standard train/test split to train and evaluate our model." |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. It only mentions the use of a neural network backbone and a deep learning framework. |
| Software Dependencies | No | The paper mentions software components like HRNet and PyTorch3D but does not specify their version numbers or any other software dependencies with version information required for replication. |
| Experiment Setup | Yes | We use HRNet [18] as the backbone network followed by four head networks made up of convolutional layers to predict foreground mask, body part index, and UV coordinates on the canonical body surface, respectively. Each network takes as an input a 224×224 image and outputs 15-channel (for foreground mask head only) or 25-channel 56×56 feature maps [8]. We train the network in two stages. |
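The Experiment Setup row describes a shared backbone feeding four small convolutional heads that emit 56×56 maps (15 channels for the foreground mask, 25 for the part-index and UV heads). The paper does not release code, so the sketch below is a hypothetical reconstruction of that head layout only: the HRNet backbone is replaced by a single strided convolution purely for illustration, and all layer choices beyond the quoted channel counts and resolutions are assumptions.

```python
import torch
import torch.nn as nn


class DenseKeypointHeads(nn.Module):
    """Illustrative multi-head decoder matching the shapes quoted above."""

    def __init__(self, feat_channels: int = 48):
        super().__init__()
        # Stand-in for HRNet: one strided conv taking a 224x224 image
        # down to a 56x56 feature map (stride 4).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_channels, kernel_size=3, stride=4, padding=1),
            nn.ReLU(inplace=True),
        )
        def head(out_ch: int) -> nn.Module:
            # Each head is a small convolutional predictor on shared features.
            return nn.Conv2d(feat_channels, out_ch, kernel_size=1)
        self.mask_head = head(15)  # foreground mask (15 channels, per the paper)
        self.part_head = head(25)  # body part index (25 channels)
        self.u_head = head(25)     # U coordinate per part
        self.v_head = head(25)     # V coordinate per part

    def forward(self, x: torch.Tensor):
        f = self.backbone(x)
        return self.mask_head(f), self.part_head(f), self.u_head(f), self.v_head(f)


if __name__ == "__main__":
    model = DenseKeypointHeads()
    mask, part, u, v = model(torch.zeros(1, 3, 224, 224))
    print(mask.shape, part.shape, u.shape, v.shape)
```

Running the sketch confirms each head produces a 56×56 map with the channel counts quoted in the setup description; the real model would additionally involve the two-stage training procedure the paper mentions.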