Real-Time Planning as Decision-Making under Uncertainty

Authors: Andrew Mitchell, Wheeler Ruml, Fabian Spaniol, Jörg Hoffmann, Marek Petrik
Pages: 2338-2345

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate these methods in a simple synthetic benchmark and the sliding tile puzzle and find that they outperform previous methods.
Researcher Affiliation | Academia | Andrew Mitchell (1), Wheeler Ruml (1), Fabian Spaniol (2), Jörg Hoffmann (2), Marek Petrik (1); (1) Department of Computer Science, University of New Hampshire, USA; (2) Department of Computer Science, Saarland University, Germany
Pseudocode | No | No explicitly labeled pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We use as benchmarks both random trees (following Pemberton and Korf (1994)) and the classic 15 puzzle benchmark (following Korf (1990)).
Dataset Splits | No | The paper mentions using '100 random 15 puzzles first generated by Korf (1985)' and 'random trees' but does not specify any training, validation, or test dataset splits or cross-validation methods.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper describes implementation details such as numerical representation of beliefs but does not provide specific software or library names with version numbers.
Experiment Setup | Yes | For Bellman, we estimate $\hat{f}(n)$ as $g(n) + 0.23\,d(n)$, where $d(n)$ is an estimate of the number of actions from $n$ to a goal (equal to $h(n)$ in domains with unit edge costs) and 0.23 is an estimate of $h$'s per-step error (obtained from pilot experiments on random trees of depth 100, where the average solution cost was about 23). For Nancy and Cserna, we tried both the correct beliefs, derived via Cserna backups from [0, 1] uniform distributions at the lookahead frontier, and more general approximate beliefs represented by Gaussians. The Gaussian beliefs are centered on $\hat{f}$ and, following O'Ceallaigh and Ruml (2015), have variance proportional to the difference between a node's $\hat{f}$ and $f$ values: $B(n) \sim \mathcal{N}(\hat{f}(n), \hat{f}(n) - f(n))$. The beliefs were implemented as truncated Gaussians bounded from below at the admissible $f$ value and above at three standard deviations.
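
The experiment setup quoted above can be made more concrete with a small sketch. The Python below is an illustration only, not the authors' implementation: the function names (`f_hat`, `gaussian_belief`, `truncated_mean`) are hypothetical, the proportionality constant `scale` is an assumption (the text says only that the variance is proportional to the difference between the two estimates), and "above at three standard deviations" is read here as the mean plus three standard deviations.

```python
import math

PER_STEP_ERROR = 0.23  # estimate of h's per-step error quoted in the setup


def f_hat(g, d, per_step_error=PER_STEP_ERROR):
    """Estimate f_hat(n) = g(n) + per_step_error * d(n)."""
    return g + per_step_error * d


def gaussian_belief(f, fhat, scale=1.0):
    """Return (mean, std, lower, upper) of a truncated Gaussian belief.

    ASSUMPTION: variance = scale * (fhat - f). The source only says the
    variance is proportional to this difference; `scale` is hypothetical.
    ASSUMPTION: the upper bound is read as mean + 3 * std.
    """
    if fhat < f:
        raise ValueError("f_hat must not be below the admissible f value")
    std = math.sqrt(scale * (fhat - f))
    return fhat, std, f, fhat + 3.0 * std


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_mean(mean, std, lower, upper):
    """Expected value of a Gaussian restricted to [lower, upper]."""
    if std == 0.0:
        return mean  # degenerate belief: all mass at a single value
    a = (lower - mean) / std
    b = (upper - mean) / std
    return mean + std * (_pdf(a) - _pdf(b)) / (_cdf(b) - _cdf(a))


if __name__ == "__main__":
    # Hypothetical node: g = 10, admissible f = 10, 20 estimated steps to go.
    belief = gaussian_belief(f=10.0, fhat=f_hat(g=10.0, d=20))
    print(truncated_mean(*belief))
```

Because the lower bound (the admissible f value) usually sits closer to the mean than the upper bound does, truncation removes more probability mass below the mean than above it, so the expected value of the truncated belief lies slightly above the untruncated center.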