Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations
Authors: Sangwon Seo, Vaibhav V. Unhelkar
IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate and benchmark the performance of BTIL on synthetic multi-agent tasks as well as a novel dataset of human-agent teamwork. Our experiments show that BTIL can successfully learn team policies from demonstrations despite the influence of team members' (time-varying and potentially misaligned) mental states on their behavior. |
| Researcher Affiliation | Academia | Sangwon Seo and Vaibhav V. Unhelkar Rice University {sangwon.seo, vaibhav.unhelkar}@rice.edu |
| Pseudocode | Yes | Algorithm 1 Bayesian Team Imitation Learner (BTIL) |
| Open Source Code | No | No explicit statement providing open-source code for the methodology or a direct link to a code repository was found. |
| Open Datasets | No | No concrete access information (specific link, DOI, repository name, formal citation) for a publicly available or open dataset was provided for either the synthetic or human-AI teamwork datasets created by the authors. |
| Dataset Splits | Yes | For each domain, we generate 200 demonstrations for training and 100 for evaluation. (A hypothetical sketch of this split follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running the experiments were provided. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). |
| Experiment Setup | No | The paper mentions that 'Implementation details of BTIL and the baselines are provided in Appendix D' but does not include specific hyperparameter values or detailed training configurations in the provided text. |
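
For concreteness, the Dataset Splits row above quotes the paper as generating 200 training and 100 evaluation demonstrations per domain. The sketch below shows one plausible way to organize such a split. It is not the authors' code (which, per the Open Source Code row, is not publicly available), and every name in it (`generate_demonstrations`, `make_split`, `n_train`, `n_eval`) is a hypothetical placeholder.

```python
# Hypothetical sketch of the per-domain demonstration split reported in the
# paper (200 training / 100 evaluation demonstrations). Names and data
# structures are illustrative only; the authors' implementation is not public.
import random


def generate_demonstrations(n, seed=0):
    """Placeholder stand-in for rolling out a (suboptimal) team policy.

    Each demonstration is a list of (joint_state, joint_action) pairs;
    dummy integers are used here just to keep the sketch runnable.
    """
    rng = random.Random(seed)
    return [[(rng.randint(0, 9), rng.randint(0, 3)) for _ in range(20)]
            for _ in range(n)]


def make_split(n_train=200, n_eval=100, seed=0):
    """Return train/evaluation demonstration sets of the sizes quoted above."""
    demos = generate_demonstrations(n_train + n_eval, seed=seed)
    return demos[:n_train], demos[n_train:]


if __name__ == "__main__":
    train_demos, eval_demos = make_split()
    print(len(train_demos), len(eval_demos))  # prints: 200 100
```

Running the script prints `200 100`, matching the per-domain counts quoted in the table.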