DHER: Hindsight Experience Replay for Dynamic Goals

Authors: Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate DHER on tasks of robotic manipulation and moving object tracking, and transfer the policies from simulation to physical robots. Extensive comparison and ablation studies demonstrate the superiority of our approach, showing that DHER is a crucial ingredient to enable RL to solve tasks with dynamic goals in manipulation and grid-world domains.
Researcher Affiliation | Industry | Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang (Tencent AI Lab)
Pseudocode | Yes | Algorithm 1: Dynamic Hindsight Experience Replay with Experience Assembling
Open Source Code | Yes | Our code and environments are available at https://github.com/mengf1/DHER.
Open Datasets | No | The paper describes modifying existing environments and creating new ones, but does not provide concrete access information (link, DOI, or a specific citation with author/year) for any publicly available dataset used for training.
Dataset Splits | No | The paper does not specify exact percentages, sample counts, or citations to predefined splits for training, validation, or test sets.
Hardware Specification | Yes | We use the Universal Robots UR10 with a gripper. We use a RealSense SR300 camera to track the position of objects.
Software Dependencies | No | The paper mentions the MuJoCo physics engine but does not provide a specific version number for it or for any other software dependency.
Experiment Setup | Yes | Goals are positions in the 3D world coordinate system with a fixed tolerance (we use ϵ = 0.01). The goal velocity is v = 0.011. Rewards are binary and sparse: r(s_t, a_t, g_t) = 1[ |s^obj_{t+1} − g_{t+1}| ≤ ϵ ].
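The binary sparse reward quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the paper's code: the indicator convention (1 on success, 0 otherwise), the Euclidean distance metric, and the function name are assumptions based on the formula as quoted.

```python
import numpy as np

def sparse_reward(achieved_pos, goal_pos, eps=0.01):
    """Binary sparse reward: 1.0 iff the object position is within
    tolerance eps of the (moving) goal position, else 0.0.
    Success metric assumed to be Euclidean distance."""
    dist = np.linalg.norm(np.asarray(achieved_pos) - np.asarray(goal_pos))
    return float(dist <= eps)

# Illustrative step with a goal moving at v = 0.011 along x (axis is an assumption).
v = 0.011
goal = np.array([0.5, 0.0, 0.4])
goal_next = goal + np.array([v, 0.0, 0.0])   # g_{t+1}
obj_next = np.array([0.509, 0.0, 0.4])       # s^obj_{t+1}
r = sparse_reward(obj_next, goal_next)       # within 0.01 of the goal
```

Note that because the goal moves each step, the reward at step t is computed against the goal's next position g_{t+1}, matching the subscripts in the paper's formula.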
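The Pseudocode row refers to Algorithm 1 (experience assembling). A simplified sketch of the matching step is below; it is an assumption-laden illustration, not the paper's algorithm verbatim: the episode dictionary layout, field names, and the exact relabeling window are all hypothetical, and only the core idea (pairing a timestep where one failed episode's achieved goal coincides with another failed episode's desired goal, then relabeling with goals from the second episode's desired-goal trajectory) is taken from the paper's description.

```python
import numpy as np

def assemble_experience(ep_i, ep_j, eps=0.01):
    """Sketch of DHER-style experience assembling from two failed episodes.

    Searches for a timestep pair (p, q) where episode i's achieved goal
    matches episode j's desired goal within tolerance eps, then relabels
    episode i's preceding transitions with the corresponding slice of
    episode j's desired-goal trajectory. Returns (transition, new_goal)
    pairs, or None if no match is found.
    """
    for p, ag in enumerate(ep_i["achieved_goals"]):
        for q, dg in enumerate(ep_j["desired_goals"]):
            if np.linalg.norm(np.asarray(ag) - np.asarray(dg)) <= eps:
                m = min(p, q)  # align the two trajectories at the match point
                new_goals = ep_j["desired_goals"][q - m : q + 1]
                transitions = ep_i["transitions"][p - m : p + 1]
                return list(zip(transitions, new_goals))
    return None
```

In a full implementation, each returned pair would be turned into a relabeled transition whose reward is recomputed against the new goal before being stored in the replay buffer.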