Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Model-Based Reinforcement Learning via Imagination with Derived Memory
Authors: Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Li, Chongjie Zhang, Jianye Hao
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various high-dimensional visual control tasks in the DMControl benchmark demonstrate that IDM outperforms previous state-of-the-art methods in terms of policy robustness and further improves the sample efficiency of the model-based method. Ablation experiment results verify the superiority of IDM. |
| Researcher Affiliation | Collaboration | Yao Mu (The University of Hong Kong); Yuzheng Zhuang (Huawei Noah's Ark Lab); Bin Wang (Huawei Noah's Ark Lab); Guangxiang Zhu (Tsinghua University); Wulong Liu (Huawei Noah's Ark Lab); Jianyu Chen (Tsinghua University); Ping Luo (The University of Hong Kong); Shengbo Eben Li (Tsinghua University); Chongjie Zhang (Tsinghua University); Jianye Hao (Huawei Noah's Ark Lab) |
| Pseudocode | Yes | The IDM framework is implemented as Algorithm 1 (see Appendix A.2). |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | The experiments are implemented in DMControl tasks [18] with observation uncertainty. [...] We test the performance of IDM in DMControl environments without image uncertainty with 5 random seeds. |
| Dataset Splits | No | No explicit mention of training, validation, or test dataset splits (e.g., percentages or counts) was found. |
| Hardware Specification | Yes | The hardware setting is NVIDIA GeForce RTX 3090 GPU and 32GB RAM. |
| Software Dependencies | Yes | All the experiments were implemented with PyTorch 1.9.0, Python 3.8.5, and CUDA 11.1. |
| Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.2 |
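
For readers attempting a reproduction, the versions in the Software Dependencies row can be compared against a local environment. The following is a minimal sketch, not part of the paper or its released code: the helper names `installed_versions` and `mismatches` and the `EXPECTED` dictionary are hypothetical, and the check assumes that exact version equality is the desired criterion.

```python
import platform

# Versions reported in the Software Dependencies row of the table above.
EXPECTED = {"python": "3.8.5", "torch": "1.9.0", "cuda": "11.1"}


def installed_versions():
    """Collect local versions; torch and cuda stay None if PyTorch is absent."""
    found = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return found
    # Strip local build tags such as "+cu111" from the PyTorch version string.
    found["torch"] = torch.__version__.split("+")[0]
    # torch.version.cuda is None on CPU-only builds.
    found["cuda"] = torch.version.cuda
    return found


def mismatches(found, expected=EXPECTED):
    """Return {name: (expected, found)} for every entry that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    # Print one line per dependency that differs from the reported setup.
    for name, (want, got) in mismatches(installed_versions()).items():
        print(f"{name}: expected {want}, found {got}")
```

A mismatch does not necessarily prevent reproduction; it only flags a difference from the reported setup.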