Non-local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation

Authors: Jogendra Nath Kundu, Siddharth Seth, Anirudh Jamkhandi, Pradyumna YM, Varun Jampani, Anirban Chakraborty, Venkatesh Babu R

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks."
Researcher Affiliation | Collaboration | 1 Indian Institute of Science, Bangalore; 2 Google Research
Pseudocode | Yes | Algorithm 1: Overview of the optimization steps.
Open Source Code | Yes | Webpage: https://sites.google.com/view/sa3dhp
Open Datasets | Yes | "We use the CMU-MoCap [1] dataset as the sample set for unpaired 3D poses Y and unpaired pose sequences Ỹ. We use the synthetic SURREAL (S) dataset [79] as one of the source datasets. ... For a fair evaluation, we use the standard, in-studio Human3.6M (H) dataset [25] as both source or target domain, in different problem settings."
Dataset Splits | No | The paper uses different datasets for the source and target domains (e.g., Human3.6M as source, MPI-INF-3DHP as target) and discusses 'unsupervised adaptation' and 'direct transfer' settings, but it does not explicitly provide training/validation/test splits, with percentages, counts, or references to predefined splits, for reproducibility within a single dataset.
Hardware Specification | Yes | "The adaptation is performed on an Nvidia V100 GPU with each batch containing 8 videos each of frame-length 30 (see Suppl. for more details)."
Software Dependencies | No | The paper mentions 'ResNet-50', 'bidirectional LSTMs', and the 'Adam optimizer' but does not provide version numbers for these components or name the underlying deep learning framework (e.g., TensorFlow, PyTorch).
Experiment Setup | Yes | "The adaptation is performed on an Nvidia V100 GPU with each batch containing 8 videos each of frame-length 30 (see Suppl. for more details). ... We associate separate Adam optimizer [32] to each relational energy term which are optimized in alternate training iterations. ... The motion auto-encoder, {Em, Dm} is composed of bidirectional LSTMs [22] with 128 hidden units operating on a fixed sequence length of 30 (30 FPS)..."
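The Experiment Setup row quotes two concrete details: a motion auto-encoder {Em, Dm} built from bidirectional LSTMs with 128 hidden units over fixed 30-frame sequences, and one Adam optimizer per relational energy term, stepped in alternate iterations. A minimal sketch of how that could look is below; note that the framework (PyTorch), joint count (17), latent size, learning rate, and loss are assumptions for illustration only, since the paper specifies none of them.

```python
# Hypothetical sketch of the motion auto-encoder {Em, Dm}: bidirectional
# LSTMs with 128 hidden units over fixed-length 30-frame sequences (30 FPS).
# Joint count (17), latent size (64), and framework (PyTorch) are assumptions.
import torch
import torch.nn as nn

SEQ_LEN, N_JOINTS, HIDDEN = 30, 17, 128

class MotionEncoder(nn.Module):  # Em
    def __init__(self, latent_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(N_JOINTS * 3, HIDDEN,
                            bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * HIDDEN, latent_dim)  # fwd + bwd final states

    def forward(self, poses):            # poses: (B, 30, J*3)
        _, (h, _) = self.lstm(poses)     # h: (2, B, HIDDEN)
        h = torch.cat([h[0], h[1]], dim=-1)
        return self.proj(h)              # (B, latent_dim)

class MotionDecoder(nn.Module):  # Dm
    def __init__(self, latent_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim, HIDDEN,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * HIDDEN, N_JOINTS * 3)

    def forward(self, z):                # z: (B, latent_dim)
        z_seq = z.unsqueeze(1).expand(-1, SEQ_LEN, -1)  # latent per frame
        h, _ = self.lstm(z_seq)          # (B, 30, 2*HIDDEN)
        return self.out(h)               # (B, 30, J*3)

if __name__ == "__main__":
    Em, Dm = MotionEncoder(), MotionDecoder()
    x = torch.randn(8, SEQ_LEN, N_JOINTS * 3)  # batch of 8 pose sequences

    # Per the quoted setup, each relational energy term gets its own Adam
    # optimizer, stepped in alternate training iterations. The reconstruction
    # loss here is a placeholder, not the paper's relational energy terms.
    opts = [torch.optim.Adam(list(Em.parameters()) + list(Dm.parameters()),
                             lr=1e-4) for _ in range(2)]
    for it in range(2):
        opt = opts[it % len(opts)]       # alternate between optimizers
        opt.zero_grad()
        loss = ((Dm(Em(x)) - x) ** 2).mean()
        loss.backward()
        opt.step()
    print(tuple(Dm(Em(x)).shape))
```

With 17 joints the per-frame pose vector has 51 values, so a batch of 8 sequences round-trips to shape (8, 30, 51); the bidirectional LSTM doubles the 128-unit hidden state, hence the 256-wide projection layers.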