PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Authors: Yuda Song, Wen Sun

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimentally, we first demonstrate the flexibility and the efficacy of our algorithm on a set of exploration-challenging control tasks where existing empirical model-based RL approaches completely fail. We then show that our approach retains excellent performance even in common dense reward control benchmarks that do not require heavy exploration."
Researcher Affiliation | Academia | "1Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA 2Department of Computer Science, Cornell University, Ithaca, USA."
Pseudocode | Yes | Algorithm 1 (The PC-MLP Framework); Algorithm 2 (Deep PC-MLP). See the sketch after this table.
Open Source Code | No | The paper does not include any explicit statement about releasing the source code for their proposed method, nor does it provide a link to a code repository.
Open Datasets | Yes | "We test Deep PC-MLP in 10 Mujoco (Todorov et al., 2012) locomotion and navigation environments."
Dataset Splits | No | The paper mentions training with "200k real-world samples" and using "4 random seeds", but it does not specify any dataset splits for training, validation, or testing, nor does it describe a cross-validation setup.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as CPU or GPU models, memory, or cloud instance specifications.
Software Dependencies | No | The paper mentions software components like OpenAI Gym, Mujoco, TRPO, and MPPI, but it does not provide specific version numbers for any of these or other software dependencies.
Experiment Setup | Yes | "We include all experiments and hyperparameter details in Appendix D."
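
The Pseudocode row above points to the paper's two listings. As a rough illustration of the policy-cover exploration loop that Algorithm 1 describes (maintain a cover of past policies, collect data from their uniform mixture, build a coverage-based exploration bonus, fit a dynamics model, and plan against reward plus bonus), here is a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' code: the helper names `featurize`, `fit_model`, and `plan` are hypothetical placeholders, the classic Gym step/reset API is assumed, and the hyperparameters are arbitrary.

import numpy as np

def policy_cover_loop(env, featurize, fit_model, plan, n_iters=10,
                      episodes_per_iter=20, bonus_scale=1.0, ridge=1e-3):
    """Hypothetical sketch of a policy-cover exploration loop in the spirit
    of the paper's Algorithm 1. `featurize`, `fit_model`, and `plan` are
    placeholders supplied by the caller, not part of the original method."""
    # Start the cover from a single random policy (an assumption).
    policy_cover = [lambda s: env.action_space.sample()]
    dataset, feats = [], []
    for _ in range(n_iters):
        # 1. Collect trajectories with the uniform mixture over the cover
        #    (classic Gym API assumed: reset() -> obs, step() -> 4-tuple).
        for _ in range(episodes_per_iter):
            pi = policy_cover[np.random.randint(len(policy_cover))]
            s, done = env.reset(), False
            while not done:
                a = pi(s)
                s_next, r, done, _ = env.step(a)
                dataset.append((s, a, r, s_next))
                feats.append(featurize(s, a))
                s = s_next
        # 2. Ridge-regularized empirical feature covariance of the mixture.
        Phi = np.stack(feats)
        Sigma = Phi.T @ Phi / len(Phi) + ridge * np.eye(Phi.shape[1])
        Sigma_inv = np.linalg.inv(Sigma)
        # 3. Elliptical bonus: large where the cover's coverage is poor.
        def bonus(s, a, Sigma_inv=Sigma_inv):
            phi = featurize(s, a)
            return bonus_scale * float(phi @ Sigma_inv @ phi)
        # 4. Fit a model on all data so far, plan against reward + bonus,
        #    and grow the cover with the resulting policy.
        model = fit_model(dataset)
        policy_cover.append(plan(model, bonus))
    return policy_cover

The ridge term keeps the covariance invertible in early iterations, when the cover contains only the initial random policy and the collected features may not span the full feature space.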