Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
Authors: Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our core hypothesis by comparing learning performance on new tasks starting from the initialization produced by MandRIL with learning a separate model for every task, starting either from a random initialization or from an initialization obtained by supervised pre-training. We refer to these approaches as learning FROM SCRATCH and AVERAGE GRADIENT pretraining respectively. ... We consider two environments: (1) an image-based navigation task with an aerial viewpoint, (2) a first-person navigation task in a simulated home environment with object interaction. |
| Researcher Affiliation | Academia | 1Department of Electrical Engineering and Computer Science, University of California, Berkeley, USA. |
| Pseudocode | Yes | Algorithm 1 Meta Reward and Intention Learning (MandRIL) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | We use an environment built on top of the SUNCG dataset (Song et al., 2017) |
| Dataset Splits | Yes | The meta-train set is composed of 1004 tasks, the TEST set contains 236 tasks, and the UNSEEN-HOUSES set contains 173 tasks. |
| Hardware Specification | No | The paper does not specify the exact hardware components (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions various software components and frameworks but does not provide specific version numbers for reproducibility. |
| Experiment Setup | No | We describe here the environments and evaluation protocol and provide detailed experimental settings and hyperparameters for both domains in Appendices A and B. |
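For context on the rows above, the following is a minimal, hypothetical JAX sketch of the MAML-style inner/outer loop that the Pseudocode row (Algorithm 1, MandRIL) refers to: meta-learning a reward-function initialization that can be adapted to a new task from a few demonstrations, which is what the comparison quoted in the Research Type row evaluates. This is not the authors' implementation. The real inner objective is the full MaxEnt IRL likelihood computed with soft value iteration over the MDP, whereas a simple state-level softmax stands in for it here, and every name (`reward`, `irl_loss`, `adapt`, `meta_step`) is illustrative.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def reward(params, features):
    # Linear reward over state features; a stand-in for the convolutional
    # reward network used in the paper.
    return features @ params

def irl_loss(params, demo_features, all_features):
    # Simplified stand-in for the MaxEnt IRL negative log-likelihood:
    # demonstrated states should score highly relative to a soft normalizer
    # over all states. (The actual objective requires soft value iteration
    # over the MDP rather than this state-level softmax.)
    return -(jnp.mean(reward(params, demo_features))
             - logsumexp(reward(params, all_features)))

def adapt(meta_params, demo_features, all_features, inner_lr=0.1):
    # Inner loop: one gradient step on the task's demonstrations,
    # starting from the meta-learned initialization (MAML-style).
    grads = jax.grad(irl_loss)(meta_params, demo_features, all_features)
    return meta_params - inner_lr * grads

def meta_loss(meta_params, task):
    # Outer objective: after adapting on a few demonstrations, the reward
    # should also explain held-out demonstrations from the same task.
    demo_f, all_f, heldout_demo_f, heldout_all_f = task
    adapted = adapt(meta_params, demo_f, all_f)
    return irl_loss(adapted, heldout_demo_f, heldout_all_f)

def meta_step(meta_params, tasks, meta_lr=0.01):
    # Outer loop: update the initialization so that a single inner step
    # works well across a batch of meta-training tasks.
    batch_loss = lambda p: sum(meta_loss(p, t) for t in tasks) / len(tasks)
    grads = jax.grad(batch_loss)(meta_params)
    return meta_params - meta_lr * grads
```

Under this sketch, evaluation on a meta-test task would amount to calling `adapt` with the new task's demonstrations and measuring the adapted reward, which mirrors the MandRIL-versus-FROM-SCRATCH-versus-AVERAGE-GRADIENT comparison quoted in the Research Type row.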