Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Authors: Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our experiments on multiple continuous control tasks demonstrate the effectiveness of our approach compared to state-of-the-art imitation and inverse reinforcement learning methods." |
| Researcher Affiliation | Academia | "Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon. Department of Computer Science, Stanford University, Stanford, CA 94305. {lantaoyu,tianheyu,cbfinn,ermon}@cs.stanford.edu" |
| Pseudocode | Yes | "Algorithm 1: PEMIRL Meta-Training" |
| Open Source Code | Yes | "Full video results are on the anonymous supplementary website and our code is open-sourced on GitHub. Our implementation of PEMIRL can be found at: https://github.com/ermongroup/MetaIRL" |
| Open Datasets | No | "We collect demonstrations by training experts with TRPO using ground truth reward." (A hedged sketch of this collection step follows the table.) |
| Dataset Splits | No | The paper mentions a "meta-training set" and "meta-test time" but does not provide specific dataset splits (e.g., percentages or counts) for training, validation, or testing. |
| Hardware Specification | No | The paper uses the MuJoCo physics engine for simulations but does not specify the hardware (GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions the MuJoCo physics engine and TRPO but does not give version numbers for these or any other software dependencies. |
| Experiment Setup | No | "We provide full hyperparameters, architecture information, data efficiency, and experimental setup details in Appendix F." |
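For concreteness, the demonstration-collection step quoted under Open Datasets could look roughly like the sketch below. This is a hedged illustration, not the authors' released code: the helper names `train_expert_with_trpo` and `collect_demonstrations`, and the choice of `HalfCheetah-v2` as the environment, are assumptions made for this example; the paper's actual tasks and TRPO implementation may differ.

```python
# Minimal sketch (assumed, not the authors' code): collect expert demonstrations
# by rolling out a policy trained with TRPO on the ground-truth reward.
import pickle
import gym
import numpy as np


def train_expert_with_trpo(env):
    """Hypothetical placeholder: train an expert policy on `env` with TRPO
    using the ground-truth reward, and return a callable obs -> action."""
    raise NotImplementedError("Plug in any TRPO implementation here.")


def collect_demonstrations(env, expert_policy, num_trajectories=20, horizon=200):
    """Roll out the expert policy and record (observation, action) pairs."""
    demonstrations = []
    for _ in range(num_trajectories):
        obs = env.reset()
        observations, actions = [], []
        for _ in range(horizon):
            action = expert_policy(obs)
            observations.append(obs)
            actions.append(action)
            obs, _, done, _ = env.step(action)
            if done:
                break
        demonstrations.append({
            "observations": np.array(observations),
            "actions": np.array(actions),
        })
    return demonstrations


if __name__ == "__main__":
    # Assumed MuJoCo task for illustration; the paper evaluates several
    # continuous control tasks.
    env = gym.make("HalfCheetah-v2")
    expert = train_expert_with_trpo(env)
    demos = collect_demonstrations(env, expert)
    with open("expert_demos.pkl", "wb") as f:
        pickle.dump(demos, f)
```

The saved demonstrations would then serve as the meta-training data for PEMIRL; the exact trajectory counts, horizons, and tasks used by the authors are given in the paper's Appendix F.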