Non-local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation

Authors: Jogendra Nath Kundu, Siddharth Seth, Anirudh Jamkhandi, Pradyumna YM, Varun Jampani, Anirban Chakraborty, Venkatesh Babu R

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks."
Researcher Affiliation | Collaboration | 1 Indian Institute of Science, Bangalore; 2 Google Research
Pseudocode | Yes | Algorithm 1: Overview of the optimization steps.
Open Source Code | Yes | Webpage: https://sites.google.com/view/sa3dhp
Open Datasets | Yes | "We use the CMU-MoCap [1] dataset as the sample set for unpaired 3D poses Y and unpaired pose sequences Ỹ. We use the synthetic SURREAL (S) dataset [79] as one of the source datasets. ... For a fair evaluation, we use the standard, in-studio Human3.6M (H) dataset [25] as both source or target domain, in different problem settings."
Dataset Splits | No | The paper uses different datasets for the source and target domains (e.g., Human3.6M as source, MPI-INF-3DHP as target) and discusses 'unsupervised adaptation' and 'direct transfer' settings, but it does not explicitly provide training/validation/test splits, with percentages, counts, or references to predefined splits, for reproducibility within a single dataset.
Hardware Specification | Yes | "The adaptation is performed on an Nvidia V100 GPU with each batch containing 8 videos each of frame-length 30 (see Suppl. for more details)."
Software Dependencies | No | The paper mentions 'ResNet-50', 'bidirectional LSTMs', and the 'Adam optimizer' but does not provide version numbers for these components or name the underlying deep learning framework (e.g., TensorFlow, PyTorch).
Experiment Setup | Yes | "The adaptation is performed on an Nvidia V100 GPU with each batch containing 8 videos each of frame-length 30 (see Suppl. for more details). ... We associate separate Adam optimizer [32] to each relational energy term which are optimized in alternate training iterations. ... The motion auto-encoder, {Em, Dm} is composed of bidirectional LSTMs [22] with 128 hidden units operating on a fixed sequence length of 30 (30 FPS)..."
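The Experiment Setup row quotes two concrete details: a motion auto-encoder {Em, Dm} built from bidirectional LSTMs with 128 hidden units over fixed 30-frame sequences, and one Adam optimizer per relational energy term, stepped in alternate iterations. A minimal sketch of how that could look is below; note that the framework (PyTorch), joint count (17), latent size, learning rate, and loss are assumptions for illustration only, since the paper specifies none of them.

```python
# Hypothetical sketch of the motion auto-encoder {Em, Dm}: bidirectional
# LSTMs with 128 hidden units over fixed-length 30-frame sequences (30 FPS).
# Joint count (17), latent size (64), and framework (PyTorch) are assumptions.
import torch
import torch.nn as nn

SEQ_LEN, N_JOINTS, HIDDEN = 30, 17, 128

class MotionEncoder(nn.Module):  # Em
    def __init__(self, latent_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(N_JOINTS * 3, HIDDEN,
                            bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * HIDDEN, latent_dim)  # fwd + bwd final states

    def forward(self, poses):            # poses: (B, 30, J*3)
        _, (h, _) = self.lstm(poses)     # h: (2, B, HIDDEN)
        h = torch.cat([h[0], h[1]], dim=-1)
        return self.proj(h)              # (B, latent_dim)

class MotionDecoder(nn.Module):  # Dm
    def __init__(self, latent_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim, HIDDEN,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * HIDDEN, N_JOINTS * 3)

    def forward(self, z):                # z: (B, latent_dim)
        z_seq = z.unsqueeze(1).expand(-1, SEQ_LEN, -1)  # latent per frame
        h, _ = self.lstm(z_seq)          # (B, 30, 2*HIDDEN)
        return self.out(h)               # (B, 30, J*3)

if __name__ == "__main__":
    Em, Dm = MotionEncoder(), MotionDecoder()
    x = torch.randn(8, SEQ_LEN, N_JOINTS * 3)  # batch of 8 pose sequences

    # Per the quoted setup, each relational energy term gets its own Adam
    # optimizer, stepped in alternate training iterations. The reconstruction
    # loss here is a placeholder, not the paper's relational energy terms.
    opts = [torch.optim.Adam(list(Em.parameters()) + list(Dm.parameters()),
                             lr=1e-4) for _ in range(2)]
    for it in range(2):
        opt = opts[it % len(opts)]       # alternate between optimizers
        opt.zero_grad()
        loss = ((Dm(Em(x)) - x) ** 2).mean()
        loss.backward()
        opt.step()
    print(tuple(Dm(Em(x)).shape))
```

With 17 joints the per-frame pose vector has 51 values, so a batch of 8 sequences round-trips to shape (8, 30, 51); the bidirectional LSTM doubles the 128-unit hidden state, hence the 256-wide projection layers.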