Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments
Authors: Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that combining BGI-agents (a pedagogical teacher and a pragmatic learner) results in faster learning and reduced goal ambiguity over standard learning from demonstrations, especially in the few demonstrations regime. We provide the code for our experiments¹, as well as an illustrative video explaining our approach². |
| Researcher Affiliation | Academia | Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani; Sorbonne Université, CNRS, Institut des Systèmes Intelligents et de Robotique (ISIR), Paris, France; casellesdupre.hugo@gmail.com, {olivier.sigaud, mohamed.chetouani}@isir.upmc.fr |
| Pseudocode | Yes | Algorithm 1 Two-phases training of the teacher and the learner |
| Open Source Code | Yes | We provide the code for our experiments¹, as well as an illustrative video explaining our approach². ¹ https://github.com/Caselles/NeurIPS22-demonstrations-pedagogy-pragmatism |
| Open Datasets | Yes | FBS is a block-stacking environment with two Fetch robots (teacher and learner) equipped with robotic arms, see Fig. 2. It is based on MuJoCo [41] and derived from the Fetch tasks [36]. |
| Dataset Splits | No | The paper describes testing procedures and splits for evaluation metrics (GIA, OGIA, GRA) but does not provide explicit training/validation/test dataset splits in terms of percentages or counts for a fixed dataset, as is common in supervised learning. For reinforcement learning, data is typically generated iteratively rather than from a pre-defined static dataset with fixed splits. |
| Hardware Specification | No | The paper states |
| Software Dependencies | No | The paper mentions several software components and algorithms such as |
| Experiment Setup | Yes | All architecture, training and hyperparameters details are provided in Appendix B.2. ... Additionally, the pedagogical teacher rewards itself with the |