Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Scalable Safe Policy Improvement for Factored Multi-Agent MDPs
Authors: Federico Bianchi, Edoardo Zorzi, Alberto Castellini, Thiago D. Simão, Matthijs T. J. Spaan, Alessandro Farinelli
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical evaluation on multi-agent Sys Admin and multi-UAV Delivery shows that the approach scales to very large domains where state-of-the-art methods cannot work. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Verona, Verona, Italy 2Department of Software Science, Eindhoven University of Technology, Eindhoven, Netherlands 3Department of Intelligent Systems, Delft University of Technology, Delft, Netherlands. |
| Pseudocode | Yes | Algorithm 1 Factored-Value MCTS-SPIBB |
| Open Source Code | Yes | Code available at https://github.com/Isla-lab/fv-mcts-spibb |
| Open Datasets | Yes | Multi-agent Sys Admin is a standard MMDP benchmark (Guestrin et al., 2003). Multi-UAV Delivery was proposed in (Choudhury et al., 2021). |
| Dataset Splits | No | The paper does not explicitly provide information about a validation dataset split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | For FV-MCTS-SPIBB-Max-Plus and FV-MCTS-SPIBB-Var-El, we use the following parameters: 100 simulations, an exploration constant empirically found to be best at c = n (with n the number of agents), an MCTS tree depth of 20 steps, γ = 0.9, and 8 iterations of message passing in Max-Plus. |
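
The settings in the Experiment Setup row can be collected into one configuration object, which may help when trying to reproduce the runs. The sketch below is illustrative only: the class name `FvMctsSpibbConfig` and all field names are assumptions and do not come from the paper or its repository. The exploration constant follows the table as written (c = n).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FvMctsSpibbConfig:
    """Hypothetical container for the reported settings (not the authors' API)."""

    n_agents: int                 # n, the number of agents in the domain
    n_simulations: int = 100      # MCTS simulations
    tree_depth: int = 20          # MCTS tree depth, in steps
    discount: float = 0.9         # gamma
    max_plus_iterations: int = 8  # message-passing rounds (Max-Plus variant only)

    @property
    def exploration_constant(self) -> float:
        # The table reports c = n, with n the number of agents.
        return float(self.n_agents)


# Example: settings for a hypothetical 4-agent instance.
config = FvMctsSpibbConfig(n_agents=4)
```

Deriving `exploration_constant` from `n_agents` instead of storing it keeps the two values consistent when the number of agents changes.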