Information Gathering and Reward Exploitation of Subgoals for POMDPs

Authors: Hang Ma, Joelle Pineau

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that IGRES is an effective multi-purpose POMDP solver, providing state-of-the-art performance for both long horizon planning tasks and information-gathering tasks on benchmark domains. Additional experiments with an ecological adaptive management problem indicate that IGRES is a promising tool for POMDP planning in real-world settings.
Researcher Affiliation | Academia | Hang Ma and Joelle Pineau, School of Computer Science, McGill University, Montreal, Canada
Pseudocode | Yes | Algorithm 1 Backup(Γ,b)
Open Source Code | No | Our code will be publicly released to help future efforts. (Footnote 1: The software package will be available at: http://cs.mcgill.ca/%7Ehma41/IGRES/.)
Open Datasets | Yes | We first consider classic POMDP benchmark problems of various sizes and types, and then present results for a real-world challenge domain. Next, we apply IGRES to a class of ecological adaptive management problems (Nicol et al. 2013) that was presented as an IJCAI 2013 data challenge problem to the POMDP community.
Dataset Splits | No | The paper describes running simulations to evaluate policies but does not provide specific details on training, validation, or test dataset splits or cross-validation methodology.
Hardware Specification | Yes | We performed these experiments on a computer with a 2.50GHz Intel Core i5-2450M processor and 6GB of memory. Our results are generated on a 2.67GHz Intel Xeon W3520 computer with 8GB of memory.
Software Dependencies | Yes | For HSVI2, we use the latest ZMDP version 1.1.7 (http://longhorizon.org/trey/zmdp/). For SARSOP, we use the latest APPL version 0.95 (http://bigbird.comp.nus.edu.sg/pmwiki/farm/appl/index.php?n=Main.Download).
Experiment Setup | Yes | The number of subgoals for IGRES is randomly picked roughly according to the size of each domain. For example, in Table 1, for the Tiger domain, IGRES used 1 subgoal, and for Hallway2, it used 20 subgoals.
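
The Pseudocode entry above cites Algorithm 1, Backup(Γ,b), but the listing itself is not reproduced on this page. As a rough illustration only, the sketch below shows a generic point-based Bellman backup of the kind used by point-based POMDP solvers; it is not the paper's Algorithm 1 and may differ from it. The function name `backup`, the nested-list layout of `T`, `O`, and `R`, and the toy model (the commonly used Tiger parameters: 0.85 listening accuracy, rewards -1, -100, +10) are all assumptions made for this example.

```python
from typing import List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def backup(gamma: List[Vector], b: Vector, T, O, R,
           discount: float) -> Tuple[Vector, int]:
    """Generic point-based backup at belief b (hypothetical helper).

    gamma        non-empty list of alpha-vectors, each of length |S|
    T[a][s][s2]  P(s2 | s, a)
    O[a][s2][o]  P(o | s2, a)
    R[a][s]      immediate reward for action a in state s
    Returns the new alpha-vector for b and the action that produced it.
    """
    n_states = len(b)
    best_alpha, best_action, best_value = None, -1, float("-inf")
    for a in range(len(T)):
        alpha_a = list(R[a])
        for o in range(len(O[a][0])):
            # Project every alpha-vector back through (a, o).
            projected = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in gamma
            ]
            # Keep the projection that is best at this belief.
            best_proj = max(projected, key=lambda g: dot(g, b))
            for s in range(n_states):
                alpha_a[s] += discount * best_proj[s]
        value = dot(alpha_a, b)
        if value > best_value:
            best_alpha, best_action, best_value = alpha_a, a, value
    return best_alpha, best_action


# Toy Tiger-style model (assumed parameters, for illustration only).
# States: 0 = tiger left, 1 = tiger right.
# Actions: 0 = listen, 1 = open left, 2 = open right.
identity = [[1.0, 0.0], [0.0, 1.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]
T = [identity, uniform, uniform]
O = [[[0.85, 0.15], [0.15, 0.85]], uniform, uniform]
R = [[-1.0, -1.0], [-100.0, 10.0], [10.0, -100.0]]

gamma0 = [[0.0, 0.0]]  # initial value function: a single zero vector
alpha, action = backup(gamma0, [0.5, 0.5], T, O, R, discount=0.95)
print(action, alpha)
```

In a full solver this backup would be applied repeatedly over a set of sampled beliefs, with each returned vector added to Γ; how IGRES selects those beliefs via subgoals is described in the paper and is not modelled here.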