Scalable Safe Policy Improvement for Factored Multi-Agent MDPs

Authors: Federico Bianchi, Edoardo Zorzi, Alberto Castellini, Thiago D. Simão, Matthijs T. J. Spaan, Alessandro Farinelli

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	An empirical evaluation on multi-agent SysAdmin and multi-UAV Delivery shows that the approach scales to very large domains where state-of-the-art methods cannot work.
Researcher Affiliation	Academia	(1) Department of Computer Science, University of Verona, Verona, Italy; (2) Department of Software Science, Eindhoven University of Technology, Eindhoven, Netherlands; (3) Department of Intelligent Systems, Delft University of Technology, Delft, Netherlands.
Pseudocode	Yes	Algorithm 1 Factored-Value MCTS-SPIBB
Open Source Code	Yes	Code available at https://github.com/Isla-lab/fv-mcts-spibb
Open Datasets	Yes	Multi-agent SysAdmin is a standard MMDP benchmark (Guestrin et al., 2003). Multi-UAV Delivery was proposed by Choudhury et al. (2021).
Dataset Splits	No	The paper does not explicitly provide information about a validation dataset split.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup	Yes	For FV-MCTS-SPIBB-Max-Plus and FV-MCTS-SPIBB-Var-El, we use the following parameters: 100 simulations, an exploration constant empirically found to be best at c = n (where n is the number of agents), an MCTS tree depth of 20 steps, γ = 0.9, and 8 iterations of message passing in Max-Plus.
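
The reported experiment setup can be captured as a small configuration object. The following is a minimal sketch, not the authors' code: the class name `FVMCTSSPIBBConfig` and all field names are hypothetical and are not taken from the linked repository; only the numeric values (100 simulations, depth 20, γ = 0.9, 8 Max-Plus iterations, c = n) come from the setup quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FVMCTSSPIBBConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""

    num_agents: int                # n, the number of agents in the domain
    n_simulations: int = 100       # MCTS simulations
    tree_depth: int = 20           # maximum MCTS tree depth, in steps
    gamma: float = 0.9             # discount factor
    max_plus_iterations: int = 8   # message-passing rounds (Max-Plus variant only)

    @property
    def exploration_constant(self) -> float:
        # The setup reports c = n as the empirically best exploration constant.
        return float(self.num_agents)

    def discount_at_max_depth(self) -> float:
        # Weight gamma^depth applied to a reward at the deepest tree level.
        return self.gamma ** self.tree_depth


config = FVMCTSSPIBBConfig(num_agents=4)  # 4 agents is an arbitrary example value
print(config.exploration_constant, config.discount_at_max_depth())
```

With γ = 0.9 and a depth of 20, rewards at the deepest level are weighted by 0.9^20, so a fixed depth of this size already discards only a small tail of the discounted return.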