Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Model-Based Reinforcement Learning via Imagination with Derived Memory

Authors: Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Li, Chongjie Zhang, Jianye Hao

NeurIPS 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various high-dimensional visual control tasks in the DMControl benchmark demonstrate that IDM outperforms previous state-of-the-art methods in terms of policy robustness and further improves the sample efficiency of the model-based method. Ablation experiment results verify the superiority of IDM.
Researcher Affiliation | Collaboration | Yao Mu (The University of Hong Kong, EMAIL); Yuzheng Zhuang (Huawei Noah's Ark Lab, EMAIL); Bin Wang (Huawei Noah's Ark Lab, EMAIL); Guangxiang Zhu (Tsinghua University, EMAIL); Wulong Liu (Huawei Noah's Ark Lab, EMAIL); Jianyu Chen (Tsinghua University, EMAIL); Ping Luo (The University of Hong Kong, EMAIL); Shengbo Eben Li (Tsinghua University, EMAIL); Chongjie Zhang (Tsinghua University, EMAIL); Jianye Hao (Huawei Noah's Ark Lab, EMAIL)
Pseudocode | Yes | The IDM framework is implemented as Algorithm 1 (see Appendix A.2).
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | Yes | The experiments are implemented in DMControl tasks [18] with observation uncertainty. [...] We test the performance of IDM in DMControl environments without image uncertainty with 5 random seeds.
Dataset Splits | No | No explicit mention of training, validation, or test dataset splits (e.g., percentages or counts) was found.
Hardware Specification | Yes | The hardware setting is NVIDIA GeForce RTX 3090 GPU and 32GB RAM.
Software Dependencies | Yes | All the experiments were implemented with Pytorch 1.9.0, Python 3.8.5, and cuda 11.1.
Experiment Setup | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.2
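
The versions quoted in the Software Dependencies row can be compared against a local environment before attempting a reproduction. The following is a minimal sketch, not part of the paper's released code: the names `REPORTED`, `installed_versions`, and `version_mismatches` are hypothetical, and it assumes that an exact string match on version numbers is the desired check.

```python
import platform

# Versions quoted in the Software Dependencies row; the keys are illustrative names.
REPORTED = {"python": "3.8.5", "torch": "1.9.0", "cuda": "11.1"}


def installed_versions():
    """Collect local versions; torch and cuda stay None if PyTorch is not installed."""
    found = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return found
    # Drop any local build suffix such as "+cu111" from the PyTorch version string.
    found["torch"] = torch.__version__.split("+")[0]
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    return found


def version_mismatches(reported, found):
    """Return {name: (reported, found)} for every entry that differs."""
    return {
        name: (want, found.get(name))
        for name, want in reported.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    mismatches = version_mismatches(REPORTED, installed_versions())
    for name, (want, got) in mismatches.items():
        print(f"{name}: reported {want}, found {got}")
```

A mismatch does not necessarily prevent the experiments from running; it only flags where the local setup departs from the reported one.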