Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach

Authors: Johan Peralez, Aurélien Delage, Olivier Buffet, Jilles Steeve Dibangoye

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section presents the outcomes of our experiments, which were carried out to juxtapose our findings with the leading-edge theory employed in global methods, encompassing the utilization of the PBVI algorithm as a standard algorithmic scheme. Our analysis involves three variants of the PBVI algorithm, namely PBVIenum, PBVImilp, and hPBVI, each employing distinct methods of performing point-based backups.
Researcher Affiliation | Academia | 1 Université de Lyon, INSA Lyon and Inria, CITI, F-69000 Lyon; 2 Université de Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy; 3 Bernoulli Institute, University of Groningen, Nijenborgh 4, NL9747 AG Groningen, Netherlands.
Pseudocode | Yes | Appendix A, The PBVI Algorithm. Algorithm 1: PBVI for M′ under HIS. function PBVI(): Initialize S0: and V0:. while V0: has not converged do improve(V0:, S0:); S0: ← expand(S0:). end while. function improve(V0:, S0:): for τ = ℓ−1 to 0 do for sτ ∈ Sτ do Vτ ← Vτ ∪ {backup(sτ, Vτ+1)}. end for end for.
Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We have comprehensively assessed various algorithms using several two-player benchmarks sourced from academic literature, available at masplan.org. These benchmarks encompass mabc, recycling, grid3x3, boxpushing, mars, and tiger.
Dataset Splits | No | The paper conducts experiments on game-theoretic benchmarks (e.g., tiger, recycling), which are simulated environments, and does not describe standard dataset splits (training, validation, and test percentages or counts) as would be typical for machine learning tasks on static datasets.
Hardware Specification | No | The paper states 'The experiments were executed on an Ubuntu machine with 32GB of available RAM and a 2.5GHz processor, utilizing only one core, with a time limit of 30 minutes,' which describes general machine characteristics but does not provide specific CPU or GPU models.
Software Dependencies | No | The paper mentions 'We used ILOG CPLEX Optimization Studio to solve the MILPs' but does not provide a specific version number for CPLEX or any other software dependencies.
Experiment Setup | Yes | For each game(n) and algorithm, we report time (in seconds) per backup and the best value for horizon ℓ = 30. The experiments were executed on an Ubuntu machine with 32GB of available RAM and a 2.5GHz processor, utilizing only one core, with a time limit of 30 minutes.
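
The control flow of the PBVI loop quoted in the Pseudocode entry can be illustrated with a minimal Python sketch. This is not the authors' implementation: the backup used here is the standard single-agent POMDP alpha-vector backup, swapped in for the paper's PBVIenum, PBVImilp, and hPBVI backups, and the toy model (T, O, R), the fixed belief grid, the identity expand step, and the "no new vector added" convergence test are all assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

Vector = Tuple[float, ...]  # an alpha-vector: one value per hidden state
Point = Tuple[float, ...]   # a sampled point (here, a belief over states)


def dot(u, v) -> float:
    return sum(x * y for x, y in zip(u, v))


def value(vectors: Set[Vector], b: Point) -> float:
    """Value of point b under a set of alpha-vectors."""
    return max(dot(alpha, b) for alpha in vectors)


def improve(values: List[Set[Vector]], points: List[List[Point]],
            backup: Callable[[Point, Set[Vector]], Vector]) -> None:
    """One backward sweep: for tau = horizon-1 down to 0, back up every point."""
    for tau in range(len(points) - 1, -1, -1):
        for s in points[tau]:
            values[tau].add(backup(s, values[tau + 1]))


def pbvi(points, backup, expand, n_states: int, max_iters: int = 50):
    """Alternate improve and expand until no sweep adds a new vector.

    Assumption: 'converged' is approximated by 'no value set grew'.
    """
    horizon = len(points)
    # values[horizon] holds the terminal (all-zero) vector.
    values = [set() for _ in range(horizon)] + [{(0.0,) * n_states}]
    for _ in range(max_iters):
        sizes = [len(v) for v in values]
        improve(values, points, backup)
        points = expand(points)
        if [len(v) for v in values] == sizes:
            break
    return values, points


# Hypothetical toy model: 2 states, 2 actions, 1 uninformative observation,
# identity transitions. Action a is rewarded only in state a.
T = [[[1.0, 0.0], [0.0, 1.0]]] * 2  # T[a][s][s2]
O = [[[1.0], [1.0]]] * 2            # O[a][s2][z]
R = [[1.0, 0.0], [0.0, 1.0]]        # R[a][s]


def pomdp_backup(b: Point, next_vectors: Set[Vector]) -> Vector:
    """Standard point-based backup at belief b (not the paper's backup)."""
    n_s, n_z = len(b), len(O[0][0])
    candidates = []
    for a in range(len(R)):
        alpha_a = list(R[a])
        for z in range(n_z):
            # Project each next-step vector through (a, z); keep the best at b.
            projected = [
                tuple(sum(T[a][s][s2] * O[a][s2][z] * alpha[s2]
                          for s2 in range(n_s)) for s in range(n_s))
                for alpha in next_vectors
            ]
            best = max(projected, key=lambda g: dot(g, b))
            alpha_a = [x + y for x, y in zip(alpha_a, best)]
        candidates.append(tuple(alpha_a))
    return max(candidates, key=lambda alpha: dot(alpha, b))


grid = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
points = [list(grid) for _ in range(2)]  # horizon 2, same points at each step
values, _ = pbvi(points, pomdp_backup, expand=lambda pts: pts, n_states=2)
print(value(values[0], (1.0, 0.0)), value(values[0], (0.5, 0.5)))
```

Passing the backup and expand steps in as functions mirrors the way the three PBVI variants in the paper differ only in how the point-based backup is performed. A real expand step would add newly reachable points rather than return its input unchanged.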