α-min: A Compact Approximate Solver For Finite-Horizon POMDPs

Authors: Yann Dujardin, Tom Dietterich, Iadine Chades

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results show that α-min provides good approximate solutions given a fixed number of α-vectors on small benchmark problems, on a larger randomly generated problem, as well as on a computational sustainability problem to best manage the endangered Sumatran tiger.5 Experiments We first assess the performance of α-min on four small finitehorizon POMDP problems from the literature2and a larger randomly generated problem (random30). We compare its performance to a leading infinite-horizon POMDP solver Sarsop [Kurniawati et al., 2008]. ... Overall, the performance of α-min is encouraging with surprisingly good gaps obtained considering the small quantity of α-vectors (Table 1).
Researcher Affiliation Collaboration Yann Dujardin CSIRO yann.dujardin@csiro.au Tom Dietterich School of EECS Oregon State University tgd@oregonstate.edu Iadine Chad es CSIRO iadine.chades@csiro.au
Pseudocode Yes Algorithm 1 Find the best belief for expanding Bt according to Equation 6 to within specified precision ϵp, Algorithm 2 ϵ-min: Solve POMDP with a maximum gap ϵ, to within precision ϵp, Algorithm 3 α-min: Solve POMDP with a maximum number N of α-vectors, to within precision ϵp
Open Source Code No No explicit statement or clear link to the source code for the methodology described in this paper was found. Footnote 1 and 3 link to 'https://sites.google.com/site/ijcaialphamin/home' and state 'POMDP files corresponding to this problem', which does not explicitly indicate the algorithms' source code.
Open Datasets Yes We first assess the performance of α-min on four small finitehorizon POMDP problems from the literature2 and a larger randomly generated problem (random30). [Footnote 2: http://pomdp.org/examples/index.shtml]Interested readers can refer to the supplementary material3 for the POMDP files corresponding to this problem.
Dataset Splits No The paper does not explicitly provide training/validation/test dataset splits. It discusses POMDP problems and a finite-horizon setup, but not in terms of data splitting for model training/evaluation.
Hardware Specification Yes α-min results were obtained using a fixed number of α-vectors set arbitrarily with a maximum computational time of 1000s per time-step on a 94.4 GB, 3.47GHz, 19 cores computer and CPLEX 12.5.
Software Dependencies Yes α-min results were obtained using a fixed number of α-vectors set arbitrarily with a maximum computational time of 1000s per time-step on a 94.4 GB, 3.47GHz, 19 cores computer and CPLEX 12.5.
Experiment Setup Yes α-min results were obtained using a fixed number of α-vectors set arbitrarily with a maximum computational time of 1000s per time-step on a 94.4 GB, 3.47GHz, 19 cores computer and CPLEX 12.5.Sarsop results were obtained by solving the POMDPs over an infinite-horizon with γ = 0.999 and a maximum computational time of 1000s. Sarsop lower bounds (LB) were calculated as the expected sum of rewards cumulated over T time-steps by simulation of the infinite policy using a γ=1.