Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control
Authors: Jie Xu, Yunsheng Tian, Pingchuan Ma, Daniela Rus, Shinjiro Sueda, Wojciech Matusik
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that the proposed algorithm can efficiently find a significantly higher-quality set of Pareto-optimal policies than existing methods. |
| Researcher Affiliation | Academia | (1) Computer Science & Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology; (2) Texas A&M University. |
| Pseudocode | Yes | Algorithm 1 Prediction-Guided MORL Algorithm |
| Open Source Code | Yes | The code can be found at https://github.com/mitgfx/PGMORL |
| Open Datasets | No | The paper designs 'seven multi-objective RL environments with continuous action space based on Mujoco', described in Appendix C, but it does not provide concrete access information (link, DOI, or formal citation for a public dataset repository) for these specific environments as a dataset. |
| Dataset Splits | No | The paper describes reinforcement learning stages (Warm-up Stage, Evolutionary Stage) and training processes for policies, but it does not specify traditional dataset splits (e.g., percentages or counts for training, validation, and testing data) as would be typical for supervised learning tasks. |
| Hardware Specification | No | The paper mentions evaluating performance using a 'physics-based simulation system (Todorov et al., 2012)' but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components such as MuJoCo, PPO, t-SNE, and k-means, but does not provide version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | More details about the experiment setup are described in Appendix D.1. The training details and parameters are reported in Appendix D.2. |