Interactive Teaching Algorithms for Inverse Reinforcement Learning
Authors: Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments with a car driving simulator environment show that the learning progress can be speeded up drastically as compared to an uninformative teacher. (...) 6 Experimental Evaluation |
| Researcher Affiliation | Academia | ¹LIONS, EPFL; ²Max Planck Institute for Software Systems (MPI-SWS) |
| Pseudocode | Yes | Algorithm 1 Interactive Teaching Framework (...) Algorithm 2 Sequential MCE-IRL (...) Algorithm 3 OMNITEACHER for sequential MCE-IRL (...) Algorithm 4 BBOXTEACHER for a sequential IRL learner |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper describes the creation of a 'car driving simulator environment' and defines 'tasks' within it, which are used to generate demonstrations for training. However, it does not provide concrete access information (link, DOI, formal citation for a public dataset) to the generated demonstrations or the simulator environment itself as a publicly available dataset. |
| Dataset Splits | No | The paper does not explicitly describe a validation set or a standard train/validation/test split for its experimental data. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | For BBOXTEACHER in Algorithm 4, we use B = 5 and k = 5. (...) We use n = 5 lanes of each task (i.e., 40 lanes in total). (...) We use similar experimental settings as in Section 6.1 (i.e., n = 5, averaging 10 runs, etc.). |