Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
Authors: Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on multiple continuous control tasks demonstrate the effectiveness of our approach compared to state-of-the-art imitation and inverse reinforcement learning methods. |
| Researcher Affiliation | Academia | Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon. Department of Computer Science, Stanford University, Stanford, CA 94305. EMAIL |
| Pseudocode | Yes | Algorithm 1 PEMIRL Meta-Training |
| Open Source Code | Yes | Full video results are on the anonymous supplementary website and our code is open-sourced on GitHub. Our implementation of PEMIRL can be found at: https://github.com/ermongroup/MetaIRL |
| Open Datasets | No | We collect demonstrations by training experts with TRPO using ground truth reward. |
| Dataset Splits | No | The paper mentions 'meta-training set' and 'meta-test time' but does not provide specific details on dataset splits (e.g., percentages or counts) for training, validation, or testing. |
| Hardware Specification | No | The paper uses the MuJoCo physics engine for simulations but does not specify any particular hardware (GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the MuJoCo physics engine and TRPO but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | No | We provide full hyperparameters, architecture information, data efficiency, and experimental setup details in Appendix F. |