Scalable Safe Policy Improvement for Factored Multi-Agent MDPs
Authors: Federico Bianchi, Edoardo Zorzi, Alberto Castellini, Thiago D. Simão, Matthijs T. J. Spaan, Alessandro Farinelli
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical evaluation on multi-agent SysAdmin and multi-UAV Delivery shows that the approach scales to very large domains where state-of-the-art methods cannot work. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, University of Verona, Verona, Italy; (2) Department of Software Science, Eindhoven University of Technology, Eindhoven, Netherlands; (3) Department of Intelligent Systems, Delft University of Technology, Delft, Netherlands. |
| Pseudocode | Yes | Algorithm 1 Factored-Value MCTS-SPIBB |
| Open Source Code | Yes | Code available at https://github.com/Isla-lab/fv-mcts-spibb |
| Open Datasets | Yes | Multi-agent SysAdmin is a standard MMDP benchmark (Guestrin et al., 2003). Multi-UAV Delivery was proposed by Choudhury et al. (2021). |
| Dataset Splits | No | The paper does not explicitly provide information about a validation dataset split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | For FV-MCTS-SPIBB-Max-Plus and FV-MCTS-SPIBB-Var-El, we use the following parameters: 100 simulations, an exploration constant c = n (where n is the number of agents), empirically found to work best, an MCTS tree depth of 20 steps, γ = 0.9, and 8 iterations of message passing in Max-Plus. |
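
The reported experiment setup can be gathered into a single configuration object, which may help when trying to reproduce the runs. The sketch below is only an illustration: the class name `FVMCTSSPIBBConfig`, its field names, and the example value of 4 agents are assumptions made here, not identifiers taken from the paper or from the linked repository. Only the numeric values come from the Experiment Setup row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FVMCTSSPIBBConfig:
    """Hypothetical container for the reported settings (all names are illustrative)."""

    n_agents: int                 # n, the number of agents in the domain
    n_simulations: int = 100      # reported number of simulations
    tree_depth: int = 20          # reported MCTS tree depth, in steps
    discount: float = 0.9         # reported gamma
    max_plus_iterations: int = 8  # reported message-passing iterations (Max-Plus variant)

    @property
    def exploration_constant(self) -> float:
        # The setup row reports c = n, with n the number of agents.
        return float(self.n_agents)


# Example with an assumed team of 4 agents (the value 4 is arbitrary).
config = FVMCTSSPIBBConfig(n_agents=4)
print(config)
print("exploration constant c =", config.exploration_constant)
```

Deriving the exploration constant from `n_agents` rather than storing it separately keeps the two values consistent when the number of agents changes between domains.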