Learning a Prior over Intent via Meta-Inverse Reinforcement Learning

Authors: Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our core hypothesis by comparing learning performance on new tasks starting from the initialization produced by MandRIL with learning a separate model for every task, starting either from a random initialization or from an initialization obtained by supervised pre-training. We refer to these approaches as learning FROM SCRATCH and AVERAGE GRADIENT pretraining respectively. ... We consider two environments: (1) an image-based navigation task with an aerial viewpoint, (2) a first-person navigation task in a simulated home environment with object interaction.
Researcher Affiliation | Academia | Department of Electrical Engineering and Computer Science, University of California, Berkeley, USA.
Pseudocode | Yes | Algorithm 1 Meta Reward and Intention Learning (MandRIL); a minimal sketch of the algorithm's loop structure is given below the table.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology.
Open Datasets | Yes | We use an environment built on top of the SUNCG dataset (Song et al., 2017)
Dataset Splits | Yes | The meta-train set is composed of 1004 tasks, the TEST set contains 236 tasks, and the UNSEEN-HOUSES set contains 173 tasks.
Hardware Specification | No | The paper does not specify the exact hardware components (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper mentions various software components and frameworks but does not provide specific version numbers for reproducibility.
Experiment Setup | No | We describe here the environments and evaluation protocol and provide detailed experimental settings and hyperparameters for both domains in Appendices A and B.
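The Pseudocode and Open Source Code rows above indicate that Algorithm 1 (MandRIL) is available only as pseudocode, with no released implementation. The sketch below is therefore only an illustration of the MAML-style inner/outer loop that MandRIL applies to a per-task IRL objective, not the authors' code: the network, the function and variable names, the hyperparameters, and the simple contrastive stand-in for the true MaxEnt IRL demonstration log-likelihood (which in the paper requires soft value iteration over the environment dynamics) are all assumptions made for illustration.

```python
# Hypothetical illustration only -- not the authors' released code. It shows a
# MAML-style inner/outer loop over a per-task IRL objective, which is the loop
# structure of Algorithm 1 (MandRIL). The true MandRIL objective is replaced by
# a dummy differentiable surrogate so the sketch stays self-contained.
import torch
import torch.nn as nn
from torch.func import functional_call  # assumes PyTorch >= 2.0


class RewardNet(nn.Module):
    """Maps a state observation to a scalar reward (the paper uses a
    convolutional network over image observations; a small MLP stands in here)."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs).squeeze(-1)


def irl_loss(reward_net, params, task):
    """Placeholder per-task objective. In MandRIL this would be the negative
    log-likelihood of the expert demonstrations under the MaxEnt model induced
    by the reward; the surrogate below simply prefers high reward on
    expert-visited observations relative to other observations."""
    expert_obs, other_obs = task
    r_expert = functional_call(reward_net, params, (expert_obs,))
    r_other = functional_call(reward_net, params, (other_obs,))
    return torch.logsumexp(torch.cat([r_expert, r_other]), dim=0) - r_expert.mean()


def adapt(reward_net, task, inner_lr=0.1, inner_steps=1):
    """Inner loop: gradient steps on the task's IRL objective, keeping the
    graph so the meta-update can differentiate through the adaptation."""
    params = dict(reward_net.named_parameters())
    for _ in range(inner_steps):
        loss = irl_loss(reward_net, params, task)
        grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
        params = {name: p - inner_lr * g
                  for (name, p), g in zip(params.items(), grads)}
    return params


def meta_train_step(reward_net, meta_opt, task_batch, inner_lr=0.1):
    """Outer loop: update the meta-initialization so that a small number of
    inner adaptation steps performs well on each sampled task."""
    meta_opt.zero_grad()
    meta_loss = 0.0
    for task in task_batch:
        adapted = adapt(reward_net, task, inner_lr)
        meta_loss = meta_loss + irl_loss(reward_net, adapted, task)
    meta_loss = meta_loss / len(task_batch)
    meta_loss.backward()
    meta_opt.step()
    return float(meta_loss)


if __name__ == "__main__":
    torch.manual_seed(0)
    net = RewardNet(obs_dim=8)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    # Dummy tasks: each pairs a few "expert-visited" observations with other
    # observations from the same environment.
    tasks = [(torch.randn(5, 8), torch.randn(20, 8)) for _ in range(4)]
    for _ in range(10):
        last = meta_train_step(net, opt, tasks)
    print("meta-loss after 10 steps:", last)
```

The point this sketch illustrates is that the outer update differentiates through the inner adaptation (via create_graph=True), so the learned reward-network initialization is explicitly optimized to adapt quickly to a new task from a few demonstrations, which is the comparison made against the FROM SCRATCH and AVERAGE GRADIENT baselines in the Research Type row.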