Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation
Authors: István Sárándi, Gerard Pons-Moll
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate our method on a variety of benchmarks: 3DPW [114] and EMDB [47] for SMPL body, AGORA [85] and EHF [87] for SMPL-X, SSP-3D [96] for SMPL focusing on body shape, as well as Human3.6M [40], MPI-INF-3DHP [70] and MuPoTS-3D [72] for 3D skeletons. |
| Researcher Affiliation | Academia | István Sárándi,1,2 Gerard Pons-Moll1,2,3 1University of Tübingen, Germany, 2Tübingen AI Center, Germany, 3Max Planck Institute for Informatics, Saarland Informatics Campus, Germany |
| Pseudocode | Yes | In Algorithm 1, we provide the simplified pseudocode for our body model fitting algorithm used in the main paper. |
| Open Source Code | Yes | We will make our code and trained models publicly available for research. |
| Open Datasets | Yes | We extensively evaluate our method on a variety of benchmarks: 3DPW [114] and EMDB [47] for SMPL body, AGORA [85] and EHF [87] for SMPL-X, SSP-3D [96] for SMPL focusing on body shape, as well as Human3.6M [40], MPI-INF-3DHP [70] and MuPoTS-3D [72] for 3D skeletons. |
| Dataset Splits | No | The paper mentions using test sets from various benchmarks (e.g., 3DPW, SSP-3D, AGORA) for evaluation, but it does not specify explicit training/validation splits (e.g., percentages or counts for a separate validation set) for its combined meta-dataset used during training. It describes mixed-batch training on a combination of datasets but not a dedicated validation split. |
| Hardware Specification | Yes | Training the S model takes 2 days on two 40 GB A100 GPUs, while the L takes 4 days on 8 A100s. NLF-S has a batched throughput of 410 fps and unbatched throughput of 79 fps on an Nvidia RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions several software components like EfficientNetV2-S and L [106], AdamW [66], YOLOv8 [42], Blender, and SMPLitex [14]. However, it does not provide specific version numbers for these software components, which is necessary for full reproducibility of the environment. |
| Experiment Setup | Yes | We use EfficientNetV2-S (256 px) and L (384 px) [106] initialized from [93], and train with AdamW [66], linear warmup and exponential learning rate decay for 300k steps. Training the S model takes 2 days on two 40 GB A100 GPUs, while the L takes 4 days on 8 A100s. We use random rotation, scaling, translation, truncation, color distortion, synthetic occlusion, random erasing and JPEG compression for data augmentation during training. |
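The quoted setup names only the schedule shape (linear warmup, then exponential decay over 300k steps), not its hyperparameters. A minimal sketch of such a schedule is below; `base_lr`, `warmup_steps`, and `final_factor` are illustrative assumptions, not values from the paper.

```python
def lr_schedule(step, base_lr=1e-4, warmup_steps=10_000,
                total_steps=300_000, final_factor=0.01):
    """Linear warmup followed by exponential learning-rate decay.

    base_lr, warmup_steps and final_factor are assumed for
    illustration; the paper states only the schedule shape and
    the 300k-step training budget.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay exponentially from base_lr to base_lr * final_factor.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * final_factor ** progress
```

In practice this would be passed to an optimizer wrapper such as a per-step callback; the schedule reaches `base_lr` exactly at the end of warmup and `base_lr * final_factor` at step 300k.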