Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

Authors: Rui Zhao, Xudong Sun, Volker Tresp

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For evaluation of this framework, we combine it with Deep Deterministic Policy Gradient, both with or without Hindsight Experience Replay. On a set of multi-goal robotic tasks of OpenAI Gym, we compare our method with other baselines and show promising improvements in both performance and sample-efficiency.
Researcher Affiliation | Collaboration | ¹Faculty of Mathematics, Informatics and Statistics, Ludwig Maximilian University of Munich, Munich, Bavaria, Germany; ²Siemens AG, Munich, Bavaria, Germany.
Pseudocode | Yes | Algorithm 1 Maximum Entropy-based Prioritization (MEP)
Open Source Code | Yes | Our code is available online at https://github.com/ruizhaogit/mep.git.
Open Datasets | Yes | We consider multi-goal reinforcement learning tasks, like the robotic simulation scenarios provided by OpenAI Gym (Plappert et al., 2018).
Dataset Splits | No | The paper mentions training and testing but does not specify training, validation, or test splits. For example: 'The learning curve with respect to training epochs is shown in Figure 3.'
Hardware Specification | No | The paper mentions 'we use 19 CPUs' but does not specify the CPU model or other hardware details (such as GPU, RAM, or processor speed) needed for reproducibility.
Software Dependencies | No | The paper states: 'The implementation uses OpenAI Baselines (Dhariwal et al., 2017) with a backend of TensorFlow (Abadi et al., 2016)', but it does not specify version numbers for OpenAI Baselines or TensorFlow.
Experiment Setup | No | The paper mentions 'train the agent for 200 epochs' as a general training duration, but the main text does not provide specific setup details such as hyperparameter values (e.g., learning rate, batch size, optimizer settings), model initialization, or network architectures.