Recursive Small-Step Multi-Agent A* for Dec-POMDPs
Authors: Wietze Koops, Nils Jansen, Sebastian Junges, Thiago D. Simão
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 7 Empirical Evaluation This section provides an empirical evaluation of RS-MAA* on a set of standard benchmarks, in comparison with GMAA*-ICE [Oliehoek et al., 2013], the current state-of-the-art exact solver for Dec-POMDPs. Furthermore, we provide an ablation study, showing the effects of clustering, computing the best response for the last agent in the last stage directly, the choice of heuristic (Q^d), and abandoning the computation of the recursive heuristic early based on a maximum number of iterations (M) or a threshold (α). |
| Researcher Affiliation | Academia | Wietze Koops, Nils Jansen, Sebastian Junges and Thiago D. Simão, Radboud University, Nijmegen, The Netherlands {wietze.koops, nils.jansen, sebastian.junges, thiago.simao}@ru.nl |
| Pseudocode | No | The paper describes algorithmic concepts and definitions but does not provide structured pseudocode or algorithm blocks, nor does it label any section as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Proofs and source code are given in the supplementary material available at https://zenodo.org/record/7949016 |
| Open Datasets | Yes | We used the standard benchmarks from the literature: DECTIGER [Nair et al., 2003], FIREFIGHTING [Oliehoek et al., 2008] (3 fire levels, 3 or 4 houses), GRID with two observations [Amato et al., 2006], BOXPUSHING [Seuken and Zilberstein, 2007], GRID3X3 [Amato et al., 2009], MARS [Amato and Zilberstein, 2009], HOTEL [Spaan and Melo, 2008], RECYCLING [Amato et al., 2007], and BROADCAST [Hansen et al., 2004]. |
| Dataset Splits | No | The paper evaluates an exact algorithm for Dec-POMDPs on standard benchmarks. These benchmarks define problem instances rather than datasets with typical train/validation/test splits used in supervised learning. Therefore, the paper does not provide information about dataset splits for training, validation, or testing its algorithm. |
| Hardware Specification | Yes | All experiments were run on a 3.7GHz Intel Core i9 CPU running Ubuntu 22.04.1 LTS. We limited each process to 16GB of memory and 1 hour of CPU time. |
| Software Dependencies | No | The paper mentions 'Python 3' and 'PyPy' for RS-MAA, and 'C++ implementation in the MADP toolbox [Oliehoek et al., 2017]' for GMAA-ICE. However, it does not provide specific version numbers for PyPy, C++, or the MADP toolbox. |
| Experiment Setup | Yes | Hyperparameter selection. Our algorithm has three hyperparameters, d, M, α, discussed in Sec. 6. For most problems, Q1 and Q2 are too expensive to compute, but Q3 is a good compromise. For easier problems, Q also works well. In a preliminary evaluation using M ∈ {100, 200, 400} and α ∈ {0.1, 0.2}, we observed that M = 100 sometimes expands many more nodes than M = 200, but increasing M to 400 did not reduce the number of nodes (significantly) compared to M = 200, and hence typically increased running times. Hence, we set M = 200. The threshold α had a small effect, so we conservatively set it to 0.2. |
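The selection procedure quoted above can be sketched as a small grid search: run the solver for each (M, α) pair, then keep the smallest M that does not inflate the number of expanded nodes, since a larger iteration cap only adds heuristic-computation cost when it does not shrink the search tree. The sketch below is illustrative only; `nodes_expanded` is a hypothetical stand-in for the authors' RS-MAA* solver (their actual code is in the Zenodo artifact), and the node counts are made up to mirror the reported trend (M = 100 expands many more nodes; M = 400 matches M = 200).

```python
from itertools import product

def nodes_expanded(M, alpha):
    """Hypothetical stand-in for one RS-MAA* run on a benchmark.

    Returns the number of A* nodes expanded for a given setting.
    The counts are illustrative, not from the paper.
    """
    counts = {(100, 0.1): 5400, (100, 0.2): 5100,
              (200, 0.1): 1900, (200, 0.2): 1800,
              (400, 0.1): 1850, (400, 0.2): 1800}
    return counts[(M, alpha)]

def pick_hyperparameters(Ms=(100, 200, 400), alphas=(0.1, 0.2)):
    # Evaluate the full preliminary grid, as described in the quote.
    results = {(M, a): nodes_expanded(M, a) for M, a in product(Ms, alphas)}
    best = min(results.values())
    # Prefer the smallest M (and then the largest alpha tried) whose
    # node count is within 5% of the best observed: raising M further
    # only increases heuristic-computation time.
    for M in sorted(Ms):
        for a in alphas:
            if results[(M, a)] <= 1.05 * best:
                return M, a
    return max(Ms), alphas[-1]

print(pick_hyperparameters())  # → (200, 0.2) on the illustrative counts
```

On these made-up counts the procedure lands on M = 200, α = 0.2, matching the values the authors report choosing; the 5% tolerance is an assumption of this sketch, not a threshold stated in the paper.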