Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Model-based Adversarial Meta-Reinforcement Learning
Authors: Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks, the generalization power to out-of-distribution tasks, and in training and test time sample efficiency, over existing state-of-the-art meta-RL algorithms. |
| Researcher Affiliation | Academia | Zichuan Lin (Tsinghua University); Garrett Thomas (Stanford University); Guangwen Yang (Tsinghua University); Tengyu Ma (Stanford University) |
| Pseudocode | Yes | Algorithm 1 gives pseudo-code for our algorithm AdMRL, which alternates the updates of dynamics $\widehat{T}_\phi$ and tasks $\psi$. |
| Open Source Code | Yes | Our code is available at https://github.com/LinZichuan/AdMRL. |
| Open Datasets | Yes | We evaluate our approach on a variety of continuous control tasks based on OpenAI gym [4], which uses the MuJoCo physics simulator [51]. |
| Dataset Splits | No | The paper mentions 'training tasks' and 'test tasks' but does not explicitly describe a validation set or its split details. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions software like 'OpenAI gym' and 'MuJoCo physics simulator' but does not provide specific version numbers for these or other ancillary software components. |
| Experiment Setup | Yes | Most hyper-parameters are taken directly from the supplied implementation. We list all the hyper-parameters used for all algorithms in the Appendix C. |