Explainable Multi-Agent Reinforcement Learning for Temporal Queries
Authors: Kayla Boggess, Sarit Kraus, Lu Feng
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have successfully applied the proposed approach to four benchmark MARL domains (up to 9 agents in one domain). Moreover, the results of a user study show that the generated explanations significantly improve user performance and satisfaction. |
| Researcher Affiliation | Academia | Kayla Boggess (University of Virginia), Sarit Kraus (Bar-Ilan University), Lu Feng (University of Virginia); {kjb5we, lu.feng}@virginia.edu, sarit@cs.biu.ac.il |
| Pseudocode | Yes | Algorithm 1 Checking the feasibility of a user query, Algorithm 2 Guided rollout, Algorithm 3 Generating reconciliation explanations |
| Open Source Code | Yes | Code available at github.com/kjboggess/ijcai23 |
| Open Datasets | Yes | (1) Search and Rescue (SR), (2) Level-Based Foraging (LBF) [Papoudakis et al., 2021], (3) Multi-Robot Warehouse (RWARE) [Papoudakis et al., 2021], (4) Pressure Plate (PLATE) [Mc Inroe and Christianos, 2022] |
| Dataset Splits | No | The paper mentions 'All models were trained and evaluated until converging to the expected reward, or up to 10,000 steps, whichever occurred first' but does not specify explicit train/validation/test splits. |
| Hardware Specification | Yes | The experiments were run on a machine with a 2.1 GHz Intel CPU, 132 GB of memory, and the CentOS 7 operating system. |
| Software Dependencies | No | Our prototype implementation used the Shared Experience Actor-Critic [Christianos et al., 2020] for MARL policy training and evaluation. The PRISM probabilistic model checker [Kwiatkowska et al., 2011] was applied for checking the feasibility of user queries. Specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | We set the guided rollout parameters as Rollout Num = 10 and Depth Limit = 50. |
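The paper's Algorithm 2 (guided rollout) is not reproduced in this report, but the two reported parameters (Rollout Num = 10, Depth Limit = 50) fit the usual shape of a bounded rollout loop. The sketch below is a generic illustration under that assumption; the toy `step` environment and `policy` are hypothetical stand-ins, not the authors' implementation.

```python
import random

# Hypothetical toy environment: a 1-D chain where the agent moves right
# until reaching a goal state. A stand-in for the MARL benchmark domains.
def step(state, action):
    next_state = state + action          # action in {0, 1}
    done = next_state >= 5               # goal reached
    return next_state, done

def policy(state):
    # Stand-in for a trained policy; here it always moves right.
    return 1

def guided_rollout(rollout_num=10, depth_limit=50, seed=0):
    """Run `rollout_num` rollouts, each capped at `depth_limit` steps,
    collecting visited-state trajectories. A generic sketch of a
    bounded rollout loop, not the paper's Algorithm 2."""
    random.seed(seed)
    trajectories = []
    for _ in range(rollout_num):
        state, trajectory = 0, [0]
        for _ in range(depth_limit):
            state, done = step(state, policy(state))
            trajectory.append(state)
            if done:
                break
        trajectories.append(trajectory)
    return trajectories

trajs = guided_rollout()
print(len(trajs), len(trajs[0]))  # → 10 6
```

With these defaults the depth cap never binds (the toy goal is reached in five steps); in the paper's domains the Depth Limit = 50 cap is what bounds each rollout's length.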