reproducibilityindex.ai

Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

Authors: Rui Zhao, Xudong Sun, Volker Tresp

ICML 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	For evaluation of this framework, we combine it with Deep Deterministic Policy Gradient, both with or without Hindsight Experience Replay. On a set of multi-goal robotic tasks of OpenAI Gym, we compare our method with other baselines and show promising improvements in both performance and sample-efficiency.
Researcher Affiliation	Collaboration	(1) Faculty of Mathematics, Informatics and Statistics, Ludwig Maximilian University of Munich, Munich, Bavaria, Germany; (2) Siemens AG, Munich, Bavaria, Germany.
Pseudocode	Yes	Algorithm 1 Maximum Entropy-based Prioritization (MEP)
Open Source Code	Yes	Our code is available online at https://github.com/ruizhaogit/mep.git.
Open Datasets	Yes	We consider multi-goal reinforcement learning tasks, like the robotic simulation scenarios provided by OpenAI Gym (Plappert et al., 2018).
Dataset Splits	No	The paper mentions training and testing, but does not describe any training, validation, or test splits. For example: 'The learning curve with respect to training epochs is shown in Figure 3.'
Hardware Specification	No	The paper mentions 'we use 19 CPUs' but does not specify the model, type, or other detailed hardware specifications (like GPU, RAM, or specific processor speeds) required for reproducibility.
Software Dependencies	No	The paper states: 'The implementation uses OpenAI Baselines (Dhariwal et al., 2017) with a backend of TensorFlow (Abadi et al., 2016)', but it does not specify the version numbers for OpenAI Baselines or TensorFlow.
Experiment Setup	No	The paper mentions 'train the agent for 200 epochs' as a general training duration, but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, optimizer settings), model initialization, or network architectures in the main text.