Recursive Small-Step Multi-Agent A* for Dec-POMDPs

Authors: Wietze Koops, Nils Jansen, Sebastian Junges, Thiago D. Simão

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental 7 Empirical Evaluation This section provides an empirical evaluation of RS-MAA on a set of standard benchmarks, in comparison with GMAA -ICE [Oliehoek et al., 2013], the current state-of-the-art exact solver for Dec-POMDPs. Furthermore, we provide an ablation study, showing the effects of clustering, computing the best response for the last agent in the last stage directly, the choice of heuristic (Qd), and abandoning the computation of the recursive heuristic early based on a maximum number of iterations (M) or a threshold (α).
Researcher Affiliation Academia Wietze Koops , Nils Jansen , Sebastian Junges and Thiago D. Sim ao Radboud University, Nijmegen, The Netherlands {wietze.koops, nils.jansen, sebastian.junges, thiago.simao}@ru.nl
Pseudocode No The paper describes algorithmic concepts and definitions but does not provide structured pseudocode or algorithm blocks, nor does it label any section as 'Pseudocode' or 'Algorithm'.
Open Source Code Yes Proofs and source code are given in the supplementary material available at https://zenodo.org/record/7949016
Open Datasets Yes We used the standard benchmarks from the literature: DECTIGER [Nair et al., 2003], FIREFIGHTING [Oliehoek et al., 2008] (3 fire levels, 3 or 4 houses), GRID with two observations [Amato et al., 2006], BOXPUSHING [Seuken and Zilberstein, 2007], GRID3X3 [Amato et al., 2009], MARS [Amato and Zilberstein, 2009], HOTEL [Spaan and Melo, 2008], RECYCLING [Amato et al., 2007], and BROADCAST [Hansen et al., 2004].
Dataset Splits No The paper evaluates an exact algorithm for Dec-POMDPs on standard benchmarks. These benchmarks define problem instances rather than datasets with typical train/validation/test splits used in supervised learning. Therefore, the paper does not provide information about dataset splits for training, validation, or testing its algorithm.
Hardware Specification Yes All experiments were ran on 3.7GHz Intel Core i9 CPU running Ubuntu 22.04.1 LTS. We limited each process to 16GB of memory and 1 hour of CPU time.
Software Dependencies No The paper mentions 'Python 3' and 'PyPy' for RS-MAA, and 'C++ implementation in the MADP toolbox [Oliehoek et al., 2017]' for GMAA-ICE. However, it does not provide specific version numbers for PyPy, C++, or the MADP toolbox.
Experiment Setup Yes Hyperparameter selection. Our algorithm has three hyperparameters, d, M, α, discussed in Sec. 6. For most problems, Q1 and Q2 are too expensive to compute, but Q3 is a good compromise. For easier problems, Q also works well. In a preliminary evaluation using M {100, 200, 400} and α {0.1, 0.2}, we observed that M = 100 sometimes expands much more nodes than M = 200, but increasing M to 400 did not reduce the number of nodes (significantly) compared to M = 200, and hence typically increased running times. Hence, we set M = 200. The threshold α had a small effect, so we conservatively set it to 0.2.