Model-Based Reinforcement Learning via Imagination with Derived Memory
Authors: Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Li, Chongjie Zhang, Jianye Hao
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various high-dimensional visual control tasks in the DMControl benchmark demonstrate that IDM outperforms previous state-of-the-art methods in terms of policy robustness and further improves the sample efficiency of the model-based method. Ablation experiment results verify the superiority of IDM. |
| Researcher Affiliation | Collaboration | Yao Mu (The University of Hong Kong, muyao@connect.hku.hk); Yuzheng Zhuang (Huawei Noah's Ark Lab, zhuangyuzheng@huawei.com); Bin Wang (Huawei Noah's Ark Lab, wangbin158@huawei.com); Guangxiang Zhu (Tsinghua University, guangxiangzhu@outlook.com); Wulong Liu (Huawei Noah's Ark Lab, liuwulong@huawei.com); Jianyu Chen (Tsinghua University, jianyuchen@tsinghua.edu.cn); Ping Luo (The University of Hong Kong, pluo@cs.hku.hk); Shengbo Eben Li (Tsinghua University, lishbo@tsinghua.edu.cn); Chongjie Zhang (Tsinghua University, chongjie@tsinghua.edu.cn); Jianye Hao (Huawei Noah's Ark Lab, haojianye@huawei.com) |
| Pseudocode | Yes | The IDM framework is implemented as Algorithm 1 (see Appendix A.2). |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | The experiments are implemented in DMControl tasks [18] with observation uncertainty. [...] We test the performance of IDM in DMControl environments without image uncertainty with 5 random seeds. |
| Dataset Splits | No | No explicit mention of training, validation, or test dataset splits (e.g., percentages or counts) was found. |
| Hardware Specification | Yes | The hardware setting is NVIDIA GeForce RTX 3090 GPU and 32GB RAM. |
| Software Dependencies | Yes | All the experiments were implemented with PyTorch 1.9.0, Python 3.8.5, and CUDA 11.1. |
| Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.2 |
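
The software versions reported in the table can be compared against a local environment before attempting a reproduction. The following is a minimal sketch, not part of the paper or its released code: the `REPORTED` dictionary and the helper names `installed_versions` and `mismatches` are hypothetical, and the check assumes that an exact version match is what you want (a looser comparison may be enough in practice).

```python
import platform

# Versions as reported in the "Software Dependencies" row above.
REPORTED = {"python": "3.8.5", "torch": "1.9.0", "cuda": "11.1"}


def installed_versions():
    """Collect local versions; torch and cuda are None if PyTorch is absent."""
    versions = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return versions
    # Drop a local build suffix such as "+cu111" from the PyTorch version.
    versions["torch"] = torch.__version__.split("+")[0]
    # torch.version.cuda is None for CPU-only builds.
    versions["cuda"] = torch.version.cuda
    return versions


def mismatches(found, reported=REPORTED):
    """Return {name: (found, reported)} for every version that differs."""
    return {
        name: (found.get(name), expected)
        for name, expected in reported.items()
        if found.get(name) != expected
    }


if __name__ == "__main__":
    diff = mismatches(installed_versions())
    if diff:
        for name, (have, want) in diff.items():
            print(f"{name}: found {have}, paper reports {want}")
    else:
        print("Environment matches the reported versions.")
```

A mismatch does not by itself mean the results cannot be reproduced; it only flags where the local setup differs from the one the authors describe.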