Model-based Reinforcement Learning for Continuous Control with Posterior Sampling

Authors: Ying Fan, Yifei Ming

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical results show that our algorithm achieves the state-of-the-art sample efficiency in benchmark continuous control tasks compared to prior model-based algorithms, and matches the asymptotic performance of model-free algorithms."
Researcher Affiliation | Academia | "University of Wisconsin-Madison. Correspondence to: Ying Fan <yfan87@wisc.edu>, Yifei Ming <ming5@wisc.edu>."
Pseudocode | Yes | "Algorithm 1 MPC-PSRL"
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the described methodology.
Open Datasets | No | The paper mentions standard benchmark environments/tasks such as "continuous Cartpole", "Pendulum Swing Up", "7-DOF Reacher", and "7-DOF pusher", but these are simulation environments in which data is generated on the fly; it provides no concrete access information (links, DOIs, repositories, or citations to specific dataset files) for publicly available datasets.
Dataset Splits | No | The paper does not provide specific details on dataset splits (e.g., exact percentages or sample counts) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used to run its experiments.
Software Dependencies | No | The paper does not provide specific software dependency details (e.g., library names with version numbers) needed to replicate the experiments.
Experiment Setup | No | "The learning curves of all compared algorithms are shown in Figure 2, and the hyperparameters and other experimental settings in our experiments are provided in Appendix."
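Since the assessment above notes that the paper provides pseudocode for "Algorithm 1 MPC-PSRL" (posterior sampling combined with model predictive control), a minimal schematic sketch of that general recipe may help orient readers. This is not the paper's Algorithm 1: the toy linear environment, the Bayesian linear dynamics model, the random-shooting planner, and all dimensions and hyperparameters below are illustrative assumptions.

```python
# Schematic sketch (not the paper's Algorithm 1): per-episode posterior sampling
# of a Bayesian linear dynamics model, with random-shooting MPC for action selection.
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, horizon = 2, 1, 10

# Assumed toy ground-truth dynamics: s' = A s + B a + noise (for illustration only).
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])

def env_step(s, a):
    return A_true @ s + B_true @ a + 0.01 * rng.standard_normal(state_dim)

def reward(s, a):
    return -float(s @ s)  # drive the state toward the origin

# Bayesian linear regression posterior over the stacked dynamics matrix [A; B].
feat_dim = state_dim + action_dim
XtX = np.eye(feat_dim)                    # prior precision
XtY = np.zeros((feat_dim, state_dim))

def sample_model():
    """Draw one dynamics model from the posterior, column by column."""
    cov = np.linalg.inv(XtX)
    mean = cov @ XtY                      # (feat_dim, state_dim)
    return np.column_stack([              # assumed fixed noise scale 0.01
        rng.multivariate_normal(mean[:, j], 0.01 * cov) for j in range(state_dim)
    ])                                    # s' ~ [s, a] @ W

def mpc_action(s, W, n_candidates=256):
    """Random-shooting MPC under the sampled model: return the best first action."""
    best_a, best_ret = None, -np.inf
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=(horizon, action_dim))
        sim_s, ret = s.copy(), 0.0
        for a in actions:
            ret += reward(sim_s, a)
            sim_s = np.concatenate([sim_s, a]) @ W
        if ret > best_ret:
            best_ret, best_a = ret, actions[0]
    return best_a

# Episodic loop: resample a model each episode, act by MPC, update the posterior.
for episode in range(5):
    W = sample_model()
    s = rng.standard_normal(state_dim)
    ep_ret = 0.0
    for t in range(20):
        a = mpc_action(s, W)
        s_next = env_step(s, a)
        x = np.concatenate([s, a])
        XtX += np.outer(x, x)             # accumulate sufficient statistics
        XtY += np.outer(x, s_next)
        ep_ret += reward(s, a)
        s = s_next
    print(f"episode {episode}: return {ep_ret:.2f}")
```

The per-episode structure (sample one model from the posterior, plan against it for the whole episode, then update the posterior with the new transitions) is the standard posterior-sampling pattern; the paper's actual model class, planner, and benchmark tasks are described in its Algorithm 1 and Appendix.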