Deep Reinforcement Learning for Active Human Pose Estimation
Authors: Erik Gärtner, Aleksis Pirinen, Cristian Sminchisescu
AAAI 2020, pp. 10835-10844
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model using single- and multi-target estimators with strong results in both settings. Our system further learns automatic stopping conditions in time and transition functions to the next temporal processing step in videos. In extensive experiments with the Panoptic multi-view setup, and for complex scenes containing multiple people, we show that our model learns to select viewpoints that yield significantly more accurate pose estimates compared to strong multi-view baselines. |
| Researcher Affiliation | Collaboration | Erik Gärtner (1), Aleksis Pirinen (1), Cristian Sminchisescu (1,2,3); (1) Department of Mathematics, Faculty of Engineering, Lund University; (2) Institute of Mathematics of the Romanian Academy; (3) Google Research; {erik.gartner, aleksis.pirinen, cristian.sminchisescu}@math.lth.se |
| Pseudocode | No | The paper does not contain pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not provide a concrete statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | The experiments use the publicly available Panoptic multi-view dataset: "In extensive experiments with the Panoptic multi-view setup, and for complex scenes containing multiple people, we show that our model learns to select viewpoints that yield significantly more accurate pose estimates compared to strong multi-view baselines." |
| Dataset Splits | Yes | The scenes are randomly split into training, validation and test sets with 10, 4 and 6 scenes, respectively. (A minimal sketch of such a split follows the table.) |
| Hardware Specification | No | The paper mentions runtimes for DMHS-based systems but does not specify any particular hardware (GPU/CPU models, etc.) used for running the experiments. |
| Software Dependencies | No | The paper mentions using Faster R-CNN, DMHS, MubyNet, and the Adam optimizer but does not specify version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | We use 5 active-sequences, each of length 10, to approximate the policy gradient, and update the policy parameters using Adam (Kingma and Ba 2015). As standard, to reduce variance we normalize cumulative rewards for each episode to zero mean and unit variance over the batch. The maximum trajectory length is set to 8 views including the initial one (10 in the multi-target mode, as it may require more views to reconstruct all people). The viewpoint selection and continue actions are trained jointly for 80k episodes. The learning rate is initially set to 5e-7 and is halved at 720k and 1440k agent steps. We linearly increase the precision parameters m_a and m_e of the von Mises distributions from (1, 10) to (25, 50) in training, making the viewpoint selection increasingly focused on high-rewarding regions as training proceeds. We use median averaging for fusing poses, cf. (2). ... The improvement threshold τ is 0.07 for DMHS and 0.04 for MubyNet. (Hedged sketches of this training schedule and of the pose fusion and stopping rule follow the table.) |
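
The training schedule quoted in the Experiment Setup row can be made concrete with a short sketch. The snippet below is a hedged illustration, not the authors' code: it assumes the viewpoint policy outputs mean azimuth/elevation angles that are sampled through von Mises distributions whose precision parameters (m_a, m_e) are linearly annealed from (1, 10) to (25, 50) over the 80k training episodes, and that cumulative rewards are normalized to zero mean and unit variance over each batch of 5 active-sequences before the policy-gradient update. All function and variable names are illustrative.

```python
import numpy as np

# Hedged sketch of the schedule in the Experiment Setup row; the helper
# names and the interpolation variable (episode index) are assumptions.

TOTAL_EPISODES = 80_000  # viewpoint selection and continue actions trained jointly
BATCH_SIZE = 5           # active-sequences per policy-gradient estimate
MAX_VIEWS = 8            # per trajectory, incl. the initial view (10 in multi-target mode)
INIT_LR = 5e-7           # halved at 720k and 1440k agent steps

M_START = (1.0, 10.0)    # (m_a, m_e): von Mises precision at the start of training
M_END = (25.0, 50.0)     # (m_a, m_e): precision at the end of training


def precision(episode: int) -> tuple[float, float]:
    """Linearly increase the von Mises precision parameters during training,
    making viewpoint selection increasingly focused on high-reward regions."""
    frac = min(episode / TOTAL_EPISODES, 1.0)
    m_a = M_START[0] + frac * (M_END[0] - M_START[0])
    m_e = M_START[1] + frac * (M_END[1] - M_START[1])
    return m_a, m_e


def sample_viewpoint(mu_a: float, mu_e: float, episode: int) -> tuple[float, float]:
    """Sample the next azimuth/elevation from von Mises distributions centred
    at the policy's predicted mean angles."""
    m_a, m_e = precision(episode)
    return np.random.vonmises(mu_a, m_a), np.random.vonmises(mu_e, m_e)


def normalize_returns(batch_returns):
    """Normalize cumulative rewards to zero mean and unit variance over the
    batch, the standard variance-reduction step mentioned in the paper."""
    r = np.asarray(batch_returns, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)
```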
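
The median pose fusion and the improvement threshold τ can be sketched in the same spirit. The paper states that poses are fused by median averaging (its eq. (2)) and that τ is 0.07 for DMHS and 0.04 for MubyNet, but the exact quantity τ is compared against is not quoted here, so the stopping rule below is only indicative.

```python
import numpy as np


def fuse_poses(per_view_poses):
    """Fuse per-view 3d pose estimates by taking the elementwise median over
    the views selected so far (median averaging, cf. eq. (2) in the paper)."""
    # per_view_poses: list of (num_joints, 3) arrays, one per selected viewpoint
    return np.median(np.stack(per_view_poses, axis=0), axis=0)


# Improvement thresholds from the paper; comparing them against a generic
# "improvement" value is an assumption made for illustration only.
TAU = {"dmhs": 0.07, "mubynet": 0.04}


def keep_selecting_views(improvement: float, estimator: str = "dmhs") -> bool:
    """Continue adding viewpoints while the fused estimate still improves by
    more than the estimator-specific threshold tau."""
    return improvement > TAU[estimator]
```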
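
Finally, the 10/4/6 scene split in the Dataset Splits row corresponds to a simple random partition. The sketch below assumes 20 named Panoptic scenes and a fixed seed; both the seed and the function name are illustrative, since the concrete split used by the authors is not reproduced in the quoted text.

```python
import random


def split_scenes(scene_names, seed=0):
    """Randomly partition the scenes into train/val/test subsets of sizes
    10, 4 and 6, matching the split sizes reported in the paper."""
    assert len(scene_names) == 20, "expects 10 + 4 + 6 scenes"
    scenes = list(scene_names)
    random.Random(seed).shuffle(scenes)
    return {"train": scenes[:10], "val": scenes[10:14], "test": scenes[14:]}
```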