Explainable Multi-Agent Reinforcement Learning for Temporal Queries

Authors: Kayla Boggess, Sarit Kraus, Lu Feng

IJCAI 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We have successfully applied the proposed approach to four benchmark MARL domains (up to 9 agents in one domain). Moreover, the results of a user study show that the generated explanations significantly improve user performance and satisfaction." |
| Researcher Affiliation | Academia | Kayla Boggess (University of Virginia), Sarit Kraus (Bar-Ilan University), and Lu Feng (University of Virginia); {kjb5we, lu.feng}@virginia.edu, sarit@cs.biu.ac.il |
| Pseudocode | Yes | Algorithm 1 (checking the feasibility of a user query), Algorithm 2 (guided rollout), and Algorithm 3 (generating reconciliation explanations); a hedged sketch of the guided rollout appears below. |
| Open Source Code | Yes | Code available at github.com/kjboggess/ijcai23 |
| Open Datasets | Yes | (1) Search and Rescue (SR); (2) Level-Based Foraging (LBF) [Papoudakis et al., 2021]; (3) Multi-Robot Warehouse (RWARE) [Papoudakis et al., 2021]; (4) Pressure Plate (PLATE) [McInroe and Christianos, 2022] |
| Dataset Splits | No | The paper states that "all models were trained and evaluated until converging to the expected reward, or up to 10,000 steps, whichever occurred first," but does not specify explicit train/validation/test splits. |
| Hardware Specification | Yes | "The experiments were run on a machine with 2.1 GHz Intel CPU, 132 GB of memory, and CentOS 7 operating system." |
| Software Dependencies | No | The prototype implementation used the Shared Experience Actor-Critic [Christianos et al., 2020] for MARL policy training and evaluation, and the PRISM probabilistic model checker [Kwiatkowska et al., 2011] for checking the feasibility of user queries (see the PRISM sketch below). Specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | The guided rollout parameters were set to Rollout Num = 10 and Depth Limit = 50. |