Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing

Authors: Yuxuan Xie, Jilles Dibangoye, Olivier Buffet

ICML 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | To support our findings, we provide three variants of the Heuristic Search Value Iteration (HSVI) algorithm (Smith, 2007) using either PWLC or linear value-function representations and compare them on standard problems from the literature. ... In all tested benchmarks, HSVI1 outperforms both HSVI2 and HSVI3, providing near-optimal (if not optimal) values at the initial state. |
| Researcher Affiliation | Academia | ¹Univ Lyon, INSA Lyon, INRIA, CITI, F-69621 Villeurbanne, France ²Université de Lorraine, INRIA, CNRS, LORIA, F-54000 Nancy, France. |
| Pseudocode | Yes | Algorithm 1: The HSVI Algorithm for bo MDP M. |
| Open Source Code | Yes | The software and data we used to generate the experiments are available at https://gitlab.inria.fr/jdibango/osisdec-pomdps. |
| Open Datasets | No | The paper states 'All used domains are available at masplan.org' but does not provide direct links, DOIs, specific repository names for each dataset, or formal citations with authors and year for the individual datasets. |
| Dataset Splits | No | The paper does not provide specific training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit references to standard predefined splits with citations). |
| Hardware Specification | Yes | We ran our variants of HSVI algorithm on an Ubuntu machine with 3.0GHz Xeon E5 CPU and 32GB available RAM. |
| Software Dependencies | No | The paper states 'We solved the MILPs using ILOG CPLEX Optimization Studio,' but does not specify a version number for the software. |
| Experiment Setup | Yes | For each of them we compare our variants of HSVI for planning horizon ℓ = 10 and discount factor γ = 1 and report different statistics, i.e. time, memory, number of trials, value, and gap. We set the time limit at 5 hours. |