Model-based Adversarial Meta-Reinforcement Learning

Authors: Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

Venue: NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks, the generalization power to out-of-distribution tasks, and in training and test time sample efficiency, over existing state-of-the-art meta-RL algorithms."
Researcher Affiliation | Academia | Zichuan Lin (Tsinghua University, lzcthu12@gmail.com); Garrett Thomas (Stanford University, gwthomas@stanford.edu); Guangwen Yang (Tsinghua University, ygw@tsinghua.edu.cn); Tengyu Ma (Stanford University, tengyuma@stanford.edu)
Pseudocode | Yes | "Algorithm 1 gives pseudo-code for our algorithm AdMRL, which alternates the updates of the dynamics model T̂_φ and the task parameters ψ." (A hedged sketch of this alternating loop appears after the table.)
Open Source Code | Yes | "Our code is available at https://github.com/LinZichuan/AdMRL."
Open Datasets | Yes | "We evaluate our approach on a variety of continuous control tasks based on OpenAI Gym [4], which uses the MuJoCo physics simulator [51]."
Dataset Splits | No | The paper mentions 'training tasks' and 'test tasks' but does not explicitly describe a validation set or its split details.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as CPU or GPU models or memory capacity.
Software Dependencies | No | The paper mentions software such as OpenAI Gym and the MuJoCo physics simulator but does not give version numbers for these or other ancillary components.
Experiment Setup | Yes | "Most hyper-parameters are taken directly from the supplied implementation. We list all the hyper-parameters used for all algorithms in Appendix C."
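The Pseudocode row above refers to Algorithm 1's alternation between fitting the dynamics model and adversarially updating the task parameters. The sketch below illustrates that alternating structure only; it is not the authors' code. All callables (collect_rollouts, fit_dynamics, adapt_policy, task_gap_grad) are hypothetical placeholders for steps that the paper specifies in detail (for instance, the task gradient is derived through the learned model rather than abstracted as it is here); the reference implementation is at https://github.com/LinZichuan/AdMRL.

```python
from typing import Any, Callable
import numpy as np

def admrl_outer_loop(
    psi: np.ndarray,                                          # task parameters (adversarial "max" player)
    phi: Any,                                                 # dynamics-model parameters ("min" player)
    collect_rollouts: Callable[[np.ndarray, Any], Any],       # (psi, policy) -> real-environment data
    fit_dynamics: Callable[[Any, Any], Any],                  # (phi, data) -> updated phi
    adapt_policy: Callable[[Any, np.ndarray], Any],           # (phi, psi) -> policy adapted using the model
    task_gap_grad: Callable[[Any, np.ndarray], np.ndarray],   # (policy, psi) -> gradient of the sub-optimality gap w.r.t. psi
    n_iters: int = 100,
    task_lr: float = 1e-2,
) -> Any:
    """Alternate the two updates named in Algorithm 1: fit the shared dynamics
    model on the current task (minimization), then move the task parameters in
    the direction that increases the sub-optimality gap (maximization)."""
    policy = adapt_policy(phi, psi)
    for _ in range(n_iters):
        data = collect_rollouts(psi, policy)   # sample the real environment for the current task
        phi = fit_dynamics(phi, data)          # model update: reduce prediction error
        policy = adapt_policy(phi, psi)        # model-based policy optimization on task psi
        grad = task_gap_grad(policy, psi)      # ascent direction for the adversarial task update
        psi = psi + task_lr * grad             # make the next task harder for the current model
    return phi
```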