Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
Authors: Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on multiple continuous control tasks demonstrate the effectiveness of our approach compared to state-of-the-art imitation and inverse reinforcement learning methods. |
| Researcher Affiliation | Academia | Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon; Department of Computer Science, Stanford University, Stanford, CA 94305; {lantaoyu,tianheyu,cbfinn,ermon}@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1 PEMIRL Meta-Training (see the illustrative sketch after this table) |
| Open Source Code | Yes | Full video results are on the anonymous supplementary website and our code is open-sourced on GitHub. Our implementation of PEMIRL can be found at: https://github.com/ermongroup/MetaIRL |
| Open Datasets | No | We collect demonstrations by training experts with TRPO using ground truth reward. |
| Dataset Splits | No | The paper mentions 'meta-training set' and 'meta-test time' but does not provide specific details on dataset splits (e.g., percentages or counts) for training, validation, or testing. |
| Hardware Specification | No | The paper uses the Mujoco physics engine for simulations but does not specify any particular hardware (GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Mujoco physics engine and TRPO but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | No | We provide full hyperparameters, architecture information, data efficiency, and experimental setup details in Appendix F. |
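
The Pseudocode row above refers to Algorithm 1 (PEMIRL Meta-Training). Below is a heavily simplified, hypothetical PyTorch sketch of the general structure that row implies: a context encoder q(m|τ) that infers a latent context from an expert demonstration, a context-conditioned discriminator whose logit serves as the learned reward, and a context-conditioned policy trained against it. All module names (`context_encoder`, `discriminator`, `policy`, `meta_training_step`), sizes, and hyperparameters are illustrative assumptions; the sketch omits the paper's AIRL-style discriminator parameterization, mutual-information regularization, and TRPO policy updates, and is not the authors' implementation.

```python
# Hypothetical sketch of a PEMIRL-style meta-training step; all dimensions,
# names, and hyperparameters are assumptions, not the paper's values.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, CONTEXT_DIM, HIDDEN, DEMO_LEN = 10, 3, 4, 64, 20

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, out_dim))

# q(m | tau): encodes a flattened expert demonstration into a latent context vector.
context_encoder = mlp((STATE_DIM + ACTION_DIM) * DEMO_LEN, CONTEXT_DIM)
# D(s, a, m): context-conditioned discriminator; its logit acts as the learned reward.
discriminator = mlp(STATE_DIM + ACTION_DIM + CONTEXT_DIM, 1)
# pi(a | s, m): context-conditioned policy (deterministic here purely for brevity).
policy = mlp(STATE_DIM + CONTEXT_DIM, ACTION_DIM)

disc_opt = torch.optim.Adam(
    list(discriminator.parameters()) + list(context_encoder.parameters()), lr=1e-3)
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
bce = nn.functional.binary_cross_entropy_with_logits

def meta_training_step(expert_states, expert_actions):
    """One adversarial update on a batch of expert demonstrations.

    expert_states:  (batch, DEMO_LEN, STATE_DIM)
    expert_actions: (batch, DEMO_LEN, ACTION_DIM)
    """
    batch = expert_states.shape[0]
    demo_flat = torch.cat([expert_states, expert_actions], dim=-1).reshape(batch, -1)
    m = context_encoder(demo_flat)                      # one latent context per task
    m_rep = m.unsqueeze(1).expand(-1, DEMO_LEN, -1)     # broadcast context to each step

    # Score expert transitions and (stand-in) policy transitions under the same context.
    expert_logits = discriminator(torch.cat([expert_states, expert_actions, m_rep], dim=-1))
    policy_actions = policy(torch.cat([expert_states, m_rep], dim=-1))
    policy_logits = discriminator(
        torch.cat([expert_states, policy_actions.detach(), m_rep], dim=-1))

    # Discriminator / reward update: push expert transitions toward 1, policy toward 0.
    disc_loss = (bce(expert_logits, torch.ones_like(expert_logits))
                 + bce(policy_logits, torch.zeros_like(policy_logits)))
    disc_opt.zero_grad()
    disc_loss.backward()
    disc_opt.step()

    # Policy update: maximize the learned reward under a frozen context.
    m_fixed = m_rep.detach()
    new_actions = policy(torch.cat([expert_states, m_fixed], dim=-1))
    policy_loss = -discriminator(
        torch.cat([expert_states, new_actions, m_fixed], dim=-1)).mean()
    policy_opt.zero_grad()
    policy_loss.backward()
    policy_opt.step()
    return disc_loss.item(), policy_loss.item()

# Smoke test with random stand-in demonstrations (a real run would use Mujoco rollouts).
print(meta_training_step(torch.randn(8, DEMO_LEN, STATE_DIM),
                         torch.randn(8, DEMO_LEN, ACTION_DIM)))
```

The random tensors stand in for environment rollouts only so the sketch runs standalone; in the paper's setting, demonstrations come from TRPO experts trained with ground-truth rewards in Mujoco, and policy samples come from environment interaction.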