Non-local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation
Authors: Jogendra Nath Kundu, Siddharth Seth, Anirudh Jamkhandi, Pradyumna YM, Varun Jampani, Anirban Chakraborty, Venkatesh Babu R
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks. |
| Researcher Affiliation | Collaboration | 1Indian Institute of Science, Bangalore 2Google Research |
| Pseudocode | Yes | Algorithm 1: Overview of the optimization steps. |
| Open Source Code | Yes | Webpage: https://sites.google.com/view/sa3dhp |
| Open Datasets | Yes | We use the CMU-MoCap [1] dataset as the sample set for unpaired 3D poses Y and unpaired pose sequences Ỹ. We use the synthetic SURREAL (S) dataset [79] as one of the source datasets. ... For a fair evaluation, we use the standard, in-studio Human3.6M (H) dataset [25] as both source or target domain, in different problem settings. |
| Dataset Splits | No | The paper describes which datasets serve as source and target domains (e.g., Human3.6M as source, MPI-INF-3DHP as target) and discusses 'unsupervised adaptation' and 'direct transfer' settings. However, it does not explicitly provide training/validation/test splits, whether as percentages, sample counts, or references to predefined splits, for reproducibility within a single dataset. |
| Hardware Specification | Yes | The adaptation is performed on an Nvidia V100 GPU with each batch containing 8 videos each of frame-length 30 (see Suppl. for more details). |
| Software Dependencies | No | The paper mentions the use of 'ResNet-50', 'bidirectional LSTMs', and 'Adam optimizer' but does not provide specific version numbers for these components or the underlying deep learning framework used (e.g., TensorFlow, PyTorch). |
| Experiment Setup | Yes | The adaptation is performed on an Nvidia V100 GPU with each batch containing 8 videos each of frame-length 30 (see Suppl. for more details). ... We associate a separate Adam optimizer [32] with each relational energy term; these are optimized in alternate training iterations. ... The motion auto-encoder, {Em, Dm}, is composed of bidirectional LSTMs [22] with 128 hidden units operating on a fixed sequence length of 30 (30 FPS)... |
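The alternating-optimizer schedule quoted above can be illustrated with a minimal toy sketch. This is not the authors' code: the two energy functions are hypothetical quadratics, plain SGD stands in for Adam, and the point is only the schedule — each energy term has its own update step, and the terms drive the parameter in alternate iterations.

```python
# Toy sketch (assumptions: hypothetical energies, SGD in place of Adam)
# illustrating the alternating training schedule described in the paper,
# where separate optimizers for separate energy terms are stepped in
# alternate iterations.

def grad_e1(x):
    # Gradient of a hypothetical energy term E1(x) = (x - 2)^2
    return 2.0 * (x - 2.0)

def grad_e2(x):
    # Gradient of a hypothetical energy term E2(x) = (x + 1)^2
    return 2.0 * (x + 1.0)

def sgd_step(x, g, lr=0.1):
    # Single gradient-descent update (stand-in for an Adam step)
    return x - lr * g

x = 0.0
for it in range(100):
    # Alternate which energy term drives the update each iteration;
    # in the paper each term would have its own Adam optimizer state.
    if it % 2 == 0:
        x = sgd_step(x, grad_e1(x))
    else:
        x = sgd_step(x, grad_e2(x))
```

With this schedule the parameter settles into a short cycle balancing the pull of the two energy terms rather than minimizing either one alone, which is the qualitative behavior alternating optimization trades on.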