Neural Dynamic Policies for End-to-End Sensorimotor Learning

Authors: Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate NDPs in imitation as well as reinforcement learning setups. NDPs can utilize high-dimensional inputs via demonstrations and learn from weak supervisory signals as well as rewards. In both setups, NDPs exhibit better or comparable performance to state-of-the-art approaches.
Researcher Affiliation | Collaboration | Shikhar Bahl (CMU), Mustafa Mukadam (FAIR), Abhinav Gupta (CMU), Deepak Pathak (CMU)
Pseudocode | Yes | Algorithm 1: Training NDPs for RL
Open Source Code | Yes | Project video and code are available at: https://shikharbahl.github.io/neural-dynamic-policies/
Open Datasets | Yes | We took existing torque control based environments for Picking and Throwing [17] and modified them to enable joint angle control. [...] To test on quasi-static tasks, we use Pushing, Soccer, Faucet-Opening from the Meta-World [46] task suite
Dataset Splits | No | The paper mentions 'train' and 'test' splits, but does not explicitly detail a separate 'validation' split or how it was used.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions software like Mujoco [43] and PPO [38], but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We run comparisons on the pushing task, varying the number of basis functions N (in the set {2, 6, 10, 15, 20}), DMP rollout lengths (in the set {3, 5, 7, 10, 15}), number of integration steps (in the set {15, 25, 35, 45}), as well as different basis functions: Gaussian RBF (standard), ψ defined in Equation (3), a linear map ψ(x) = x, a multiquadric map ψ(x) = √(1 + (εx)²), an inverse quadric map ψ(x) = 1/(1 + (εx)²), and an inverse multiquadric map ψ(x) = 1/√(1 + (εx)²)
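
The basis-function variants listed in the experiment setup row are standard radial basis functions; the sketch below shows their functional forms. How each ψ is applied to the DMP phase variable x relative to the basis centers, and the ε width parameter, are assumptions here, since the paper only gives the functional forms.

```python
import numpy as np

def basis(psi_name, x, centers, widths, eps=1.0):
    """Evaluate one candidate basis function at phase x for each basis center.

    NOTE: the centering scheme (x - centers) and eps are illustrative
    assumptions, not taken from the authors' implementation.
    """
    r = x - centers                                    # distance to each basis center
    if psi_name == "gaussian":                         # standard DMP choice
        return np.exp(-widths * r ** 2)
    if psi_name == "linear":                           # psi(x) = x
        return np.full_like(centers, x)
    if psi_name == "multiquadric":                     # sqrt(1 + (eps * r)^2)
        return np.sqrt(1.0 + (eps * r) ** 2)
    if psi_name == "inverse_quadric":                  # 1 / (1 + (eps * r)^2)
        return 1.0 / (1.0 + (eps * r) ** 2)
    if psi_name == "inverse_multiquadric":             # 1 / sqrt(1 + (eps * r)^2)
        return 1.0 / np.sqrt(1.0 + (eps * r) ** 2)
    raise ValueError(f"unknown basis: {psi_name}")
```

For example, basis("gaussian", x=0.5, centers=np.linspace(1.0, 0.0, 10), widths=np.full(10, 50.0)) returns the ten activations that weight the DMP forcing term at that phase value.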
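
The pseudocode row above points to Algorithm 1 (Training NDPs for RL), and the experiment setup ablates the rollout length and number of integration steps of the underlying DMP. The sketch below illustrates the kind of rollout being integrated: a policy network predicts DMP basis weights w and a goal g, and a second-order DMP is integrated for a fixed number of steps to produce the trajectory the robot tracks. The gains (alpha, beta, alpha_x), basis parameters, forcing-term scaling, and the 1-D simplification are standard DMP assumptions for illustration, not values from the paper.

```python
import numpy as np

def ndp_rollout(y0, w, g, n_steps=35, dt=0.01, alpha=25.0, beta=25.0 / 4.0, alpha_x=1.0):
    """Integrate a 1-D DMP with basis weights w from start y0 toward goal g."""
    n_basis = len(w)
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))   # basis centers in phase space
    widths = np.full(n_basis, n_basis ** 1.5) / centers            # common width heuristic

    y, dy, x = float(y0), 0.0, 1.0
    trajectory = []
    for _ in range(n_steps):
        psi = np.exp(-widths * (x - centers) ** 2)                 # Gaussian basis activations
        forcing = (psi @ w) / (psi.sum() + 1e-10) * x * (g - y0)   # learned forcing term
        ddy = alpha * (beta * (g - y) - dy) + forcing              # transformed system dynamics
        dy += ddy * dt
        y += dy * dt
        x += -alpha_x * x * dt                                      # canonical (phase) system
        trajectory.append(y)
    return np.array(trajectory)
```

In the RL setting of Algorithm 1, the predicted trajectory is executed in the environment and the policy network producing (w, g) is updated from the resulting rewards (the paper uses PPO [38]).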