Factored Online Planning in Many-Agent POMDPs
Authors: Maris F. L. Galesloot, Thiago D. Simão, Sebastian Junges, Nils Jansen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluation against several state-of-the-art baselines shows that our methods (1) are competitive in settings with only a few agents and (2) improve over the baselines in the presence of many agents. |
| Researcher Affiliation | Academia | 1Radboud University Nijmegen, The Netherlands 2Eindhoven University of Technology, The Netherlands 3Ruhr-University Bochum, Germany |
| Pseudocode | Yes | We provide the pseudo-code for the SIR filter that updates b in Appendix F.2. |
| Open Source Code | Yes | All algorithm variants are implemented in the same Python prototype, published online (footnote 1: https://zenodo.org/records/10409525). |
| Open Datasets | Yes | Benchmarks. FIREFIGHTINGGRAPH (FFG, Oliehoek et al. 2008) has been used to evaluate factored POMCP (Amato and Oliehoek 2015). Agents stand in a line, and houses are located to the left and right of each agent. Multi-agent ROCKSAMPLE (MARS, Cai et al. 2021) extends single-agent Rock Sample (Smith and Simmons 2004). MARS environments are defined by their size m, the number of agents n, and the number of rocks k, with k = m = 15. In CAPTURETARGET (CT), agents are tasked with capturing a moving target. We depict results for CT in Appendix A.2. Detailed benchmark descriptions are in Appendix G. |
| Dataset Splits | No | The paper describes simulations and episodes for evaluation (“averaged over 100 episodes”), but it does not specify traditional train/validation/test dataset splits as it’s an online planning paper rather than one using static datasets. |
| Hardware Specification | Yes | All code ran on a machine with an Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz and 256 GB RAM (8 x 32GB DDR4-3200). |
| Software Dependencies | No | The paper mentions “All algorithm variants are implemented in the same Python prototype,” but it does not provide specific version numbers for Python or any associated libraries. |
| Experiment Setup | Yes | We did not run an extensive hyperparameter optimization for any algorithm, and we list the most important parameters in Tab. 2 of Appendix A.1. All algorithms ran with a maximum of 5s and 15s per step on FFG/CT and MARS, respectively. If the particle filter belief is deprived at any point in time during the episode, the policy defaults to a random policy. We set the number $K$ of particles in the joint filters such that $K = \sum_e K_e$ in the factored filters, e.g., if we have three edges with $K_e = 100$, then the joint counterpart has $K = 300$. |
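
The particle budget and the random-policy fallback described in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `planner` callable, and the reading of a "deprived" belief as an empty particle set are assumptions made only for illustration.

```python
import random


def joint_particle_count(edge_particle_counts):
    """Return K for the joint filter as the sum of K_e over all edges."""
    return sum(edge_particle_counts)


def select_action(belief_particles, planner, actions, rng=random):
    """Pick an action, falling back to a uniformly random one.

    Assumption: a "deprived" belief is modelled here as an empty
    particle list. `planner` is a hypothetical callable that maps
    a non-empty particle list to an action.
    """
    if not belief_particles:
        # Belief is deprived: default to a random policy.
        return rng.choice(actions)
    return planner(belief_particles)


# Three edges with K_e = 100 each give a joint filter with K = 300.
K = joint_particle_count([100, 100, 100])
```

With a non-empty belief, `select_action` simply delegates to the planner; with an empty one it ignores the planner and samples from `actions`.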