Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations

Authors: Sangwon Seo, Vaibhav V. Unhelkar

IJCAI 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate and benchmark the performance of BTIL on synthetic multi-agent tasks as well as a novel dataset of human-agent teamwork. Our experiments show that BTIL can successfully learn team policies from demonstrations despite the influence of team members' (time-varying and potentially misaligned) mental states on their behavior. |
| Researcher Affiliation | Academia | Sangwon Seo and Vaibhav V. Unhelkar, Rice University, {sangwon.seo, vaibhav.unhelkar}@rice.edu |
| Pseudocode | Yes | Algorithm 1: Bayesian Team Imitation Learner (BTIL) |
| Open Source Code | No | No explicit statement providing open-source code for the methodology or a direct link to a code repository was found. |
| Open Datasets | No | No concrete access information (specific link, DOI, repository name, or formal citation) for a publicly available or open dataset was provided for either the synthetic or human-AI teamwork datasets created by the authors. |
| Dataset Splits | No | For each domain, we generate 200 demonstrations for training and 100 for evaluation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running the experiments were provided. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). |
| Experiment Setup | No | The paper mentions that 'Implementation details of BTIL and the baselines are provided in Appendix D' but does not include specific hyperparameter values or detailed training configurations in the provided text. |