Model-based Adversarial Meta-Reinforcement Learning
Authors: Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks, the generalization power to out-of-distribution tasks, and in training and test time sample efficiency, over existing state-of-the-art meta-RL algorithms. |
| Researcher Affiliation | Academia | Zichuan Lin, Tsinghua University, lzcthu12@gmail.com; Garrett Thomas, Stanford University, gwthomas@stanford.edu; Guangwen Yang, Tsinghua University, ygw@tsinghua.edu.cn; Tengyu Ma, Stanford University, tengyuma@stanford.edu |
| Pseudocode | Yes | Algorithm 1 gives pseudo-code for our algorithm AdMRL, which alternates the updates of dynamics $\widehat{T}_\phi$ and tasks $\psi$. |
| Open Source Code | Yes | Our code is available at https://github.com/LinZichuan/AdMRL. |
| Open Datasets | Yes | We evaluate our approach on a variety of continuous control tasks based on OpenAI Gym [4], which uses the MuJoCo physics simulator [51]. |
| Dataset Splits | No | The paper mentions 'training tasks' and 'test tasks' but does not explicitly describe a validation set or its split details. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions software such as 'OpenAI Gym' and the 'MuJoCo physics simulator' but does not provide specific version numbers for these or other ancillary software components. |
| Experiment Setup | Yes | Most hyper-parameters are taken directly from the supplied implementation. We list all the hyper-parameters used for all algorithms in Appendix C. |