Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Semi-Supervised Imitation Learning of Team Policies from Suboptimal Demonstrations
Authors: Sangwon Seo, Vaibhav V. Unhelkar
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate and benchmark the performance of BTIL on synthetic multi-agent tasks as well as a novel dataset of human-agent teamwork. Our experiments show that BTIL can successfully learn team policies from demonstrations despite the influence of team members (time-varying and potentially misaligned) mental states on their behavior. |
| Researcher Affiliation | Academia | Sangwon Seo and Vaibhav V. Unhelkar Rice University EMAIL |
| Pseudocode | Yes | Algorithm 1 Bayesian Team Imitation Learner (BTIL) |
| Open Source Code | No | No explicit statement providing open-source code for the methodology or a direct link to a code repository was found. |
| Open Datasets | No | No concrete access information (specific link, DOI, repository name, formal citation) for a publicly available or open dataset was provided for either the synthetic or human-AI teamwork datasets created by the authors. |
| Dataset Splits | No | For each domain, we generate 200 demonstrations for training and 100 for evaluation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running the experiments were provided. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). |
| Experiment Setup | No | The paper mentions that 'Implementation details of BTIL and the baselines are provided in Appendix D' but does not include specific hyperparameter values or detailed training configurations in the provided text. |