Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Authors: Ying Fan, Yifei Ming
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our algorithm achieves the state-of-the-art sample efficiency in benchmark continuous control tasks compared to prior model-based algorithms, and matches the asymptotic performance of model-free algorithms. |
| Researcher Affiliation | Academia | University of Wisconsin-Madison. Correspondence to: Ying Fan <yfan87@wisc.edu>, Yifei Ming <ming5@wisc.edu>. |
| Pseudocode | Yes | Algorithm 1 MPC-PSRL |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the described methodology. |
| Open Datasets | No | The paper evaluates on standard benchmark environments/tasks ("continuous Cartpole", "Pendulum Swing Up", "7-DOF Reacher", and "7-DOF pusher"), but it provides no concrete access information (links, DOIs, repositories, or citations to specific dataset files). Data is generated dynamically in these simulation environments rather than drawn from a fixed public dataset. |
| Dataset Splits | No | The paper does not provide specific details on dataset splits (e.g., exact percentages or sample counts) for training, validation, or testing. This is typical of online reinforcement learning, where data is collected during training rather than partitioned in advance. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details (e.g., library names with version numbers) needed to replicate the experiments. |
| Experiment Setup | No | The learning curves of all compared algorithms are shown in Figure 2, and the hyperparameters and other experimental settings in our experiments are provided in Appendix. |
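The pseudocode row refers to Algorithm 1 (MPC-PSRL): maintain a Bayesian posterior over the environment dynamics, sample one model from the posterior at the start of each episode, and plan actions with model predictive control under that sampled model. The following is a minimal illustrative sketch of that loop, not the paper's implementation: a conjugate Bayesian linear-Gaussian model and random-shooting MPC stand in for the paper's actual model class and planner, and all names, hyperparameters, and the toy system are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class BayesianLinearDynamics:
    """Conjugate Bayesian linear-Gaussian model of s' = [s; a] @ W.

    Illustrative stand-in for the paper's learned dynamics model;
    the posterior over each column of W is Gaussian.
    """
    def __init__(self, state_dim, action_dim, prior_var=1.0, noise_var=0.1):
        d = state_dim + action_dim
        self.noise_var = noise_var
        self.precision = np.eye(d) / prior_var   # posterior precision over W columns
        self.xty = np.zeros((d, state_dim))      # accumulated X^T y / noise_var

    def update(self, s, a, s_next):
        x = np.concatenate([s, a])
        self.precision += np.outer(x, x) / self.noise_var
        self.xty += np.outer(x, s_next) / self.noise_var

    def sample_model(self):
        """Draw one dynamics function from the posterior (the PSRL step)."""
        cov = np.linalg.inv(self.precision)
        mean = cov @ self.xty
        W = np.column_stack([rng.multivariate_normal(mean[:, j], cov)
                             for j in range(mean.shape[1])])
        return lambda s, a: np.concatenate([s, a]) @ W

def mpc_random_shooting(dynamics, reward_fn, s0, action_dim,
                        horizon=10, n_candidates=200):
    """Return the first action of the best random action sequence under
    the sampled model (a simple stand-in for the paper's MPC planner)."""
    seqs = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    best_ret, best_a0 = -np.inf, seqs[0, 0]
    for seq in seqs:
        s, ret = s0, 0.0
        for a in seq:
            ret += reward_fn(s, a)
            s = dynamics(s, a)
        if ret > best_ret:
            best_ret, best_a0 = ret, seq[0]
    return best_a0

# Hypothetical toy system: s' = s + 0.5 * a, reward -|s|.
true_W = np.array([[1.0], [0.5]])
reward = lambda s, a: -abs(s[0])
model = BayesianLinearDynamics(state_dim=1, action_dim=1)

s = np.array([2.0])
for episode in range(5):
    dyn = model.sample_model()          # resample dynamics once per episode
    for _ in range(20):
        a = mpc_random_shooting(dyn, reward, s, action_dim=1)
        s_next = np.concatenate([s, a]) @ true_W
        model.update(s, a, s_next)      # refine the posterior online
        s = s_next
print(float(abs(s[0])))                 # should shrink once the posterior is accurate
```

Resampling once per episode (rather than per step) is what distinguishes posterior sampling from greedy planning under the posterior mean: each sampled model is plausible under the data, so acting optimally for it drives directed exploration.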