Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach
Authors: Johan Peralez, Aurélien Delage, Olivier Buffet, Jilles Steeve Dibangoye
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section presents the outcomes of our experiments, which were carried out to juxtapose our findings with the leading-edge theory employed in global methods, encompassing the utilization of the PBVI algorithm as a standard algorithmic scheme. Our analysis involves three variants of the PBVI algorithm, namely PBVIenum, PBVImilp, and hPBVI, each employing distinct methods of performing point-based backups. |
| Researcher Affiliation | Academia | 1 Université de Lyon, INSA Lyon and Inria, CITI, F-69000 Lyon; 2 Université de Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy; 3 Bernoulli Institute, University of Groningen, Nijenborgh 4, NL-9747 AG Groningen, Netherlands. |
| Pseudocode | Yes | Appendix A. The PBVI Algorithm: Algorithm 1 PBVI for M′ under HIS. function PBVI(): Initialize S_{0:} and V_{0:}. while V_{0:} has not converged do improve(V_{0:}, S_{0:}). S_{0:} ← expand(S_{0:}). end while. function improve(V_{0:}, S_{0:}): for τ = ℓ − 1 to 0 do for s_τ ∈ S_τ do V_τ ← V_τ ∪ {backup(s_τ, V_{τ+1})}. end for end for |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We have comprehensively assessed various algorithms using several two-player benchmarks sourced from academic literature, available at masplan.org. These benchmarks encompass mabc, recycling, grid3x3, boxpushing, mars, and tiger. |
| Dataset Splits | No | The paper conducts experiments on game-theoretic benchmarks (e.g., tiger, recycling) which are simulated environments and does not describe standard dataset splits (training, validation, test percentages or counts) as would be typical for machine learning tasks on static datasets. |
| Hardware Specification | No | The paper states 'The experiments were executed on an Ubuntu machine with 32GB of available RAM and a 2.5GHz processor, utilizing only one core, with a time limit of 30 minutes,' which describes general machine characteristics but does not provide specific CPU or GPU models. |
| Software Dependencies | No | The paper mentions 'We used ILOG CPLEX Optimization Studio to solve the MILPs' but does not provide a specific version number for CPLEX or any other software dependencies. |
| Experiment Setup | Yes | For each game(n) and algorithm, we report time (in seconds) per backup and the best value for horizon ℓ = 30. The experiments were executed on an Ubuntu machine with 32GB of available RAM and a 2.5GHz processor, utilizing only one core, with a time limit of 30 minutes. |
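
The control flow of Algorithm 1, as quoted in the Pseudocode row, can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation, since the paper's code is not public. The `backup` and `expand` callables, the convergence test (stop once an improvement pass adds nothing new), the `max_iters` cap, and the toy integer problem are all assumptions introduced for illustration.

```python
from typing import Callable, Hashable, List, Set

Point = Hashable    # stand-in for a sampled point s_tau (assumption)
Element = Hashable  # stand-in for one element of a value set V_tau (assumption)


def improve(V: List[Set[Element]], S: List[Set[Point]],
            backup: Callable[[Point, Set[Element]], Element]) -> bool:
    """One backward sweep: for tau = l-1 down to 0, add backup(s, V[tau+1]) to V[tau].

    Returns True if any value set gained a new element.
    """
    changed = False
    for tau in range(len(S) - 1, -1, -1):
        for s in S[tau]:
            element = backup(s, V[tau + 1])
            if element not in V[tau]:
                V[tau].add(element)
                changed = True
    return changed


def pbvi(horizon: int, points: List[Set[Point]],
         backup: Callable[[Point, Set[Element]], Element],
         expand: Callable[[List[Set[Point]]], List[Set[Point]]],
         max_iters: int = 1000):
    """Alternate improve and expand until an improve pass changes nothing."""
    S = [set(p) for p in points]             # S[0..l-1]: sampled points per step
    V = [set() for _ in range(horizon + 1)]  # V[l] stays empty (terminal step)
    for _ in range(max_iters):
        if not improve(V, S, backup):        # assumed convergence criterion
            break
        S = expand(S)
    return V, S


# Toy problem (hypothetical): points are integers and a backup adds the point
# to the best value available at the next step.
def toy_backup(s: int, next_values: Set[int]) -> int:
    return s + max(next_values, default=0)


def toy_expand(S: List[Set[int]]) -> List[Set[int]]:
    # Grow each point set by one integer, capped at 2 so the loop terminates.
    return [pts | {min(max(pts) + 1, 2)} for pts in S]


V, S = pbvi(horizon=2, points=[{1}, {1}], backup=toy_backup, expand=toy_expand)
```

In this sketch the three PBVI variants named in the Research Type row would differ only in the `backup` callable passed in; how each variant performs its backup is not reproduced here.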