Real-Time Planning as Decision-Making under Uncertainty
Authors: Andrew Mitchell, Wheeler Ruml, Fabian Spaniol, Jörg Hoffmann, Marek Petrik. Pages 2338-2345.
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate these methods in a simple synthetic benchmark and the sliding tile puzzle and find that they outperform previous methods. |
| Researcher Affiliation | Academia | Andrew Mitchell¹, Wheeler Ruml¹, Fabian Spaniol², Jörg Hoffmann², Marek Petrik¹. ¹Department of Computer Science, University of New Hampshire, USA. ²Department of Computer Science, Saarland University, Germany. |
| Pseudocode | No | No explicitly labeled pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use as benchmarks both random trees (following Pemberton and Korf (1994)) and the classic 15 puzzle benchmark (following Korf (1990)). |
| Dataset Splits | No | The paper mentions using '100 random 15 puzzles first generated by Korf (1985)' and 'random trees' but does not specify any training, validation, or test dataset splits or cross-validation methods. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper describes implementation details such as numerical representation of beliefs but does not provide specific software or library names with version numbers. |
| Experiment Setup | Yes | For Bellman, we estimate $\hat{f}(n)$ as $g(n) + 0.23\,d(n)$, where $d(n)$ is an estimate of the number of actions from $n$ to a goal (equal to $h(n)$ in domains with unit edge costs) and 0.23 is an estimate of $h$'s per-step error (obtained from pilot experiments on random trees of depth 100, where the average solution cost was about 23). For Nancy and Cserna, we tried both the correct beliefs, derived via Cserna backups from [0, 1] uniform distributions at the lookahead frontier, and more general approximate beliefs represented by Gaussians. The Gaussian beliefs are centered on $\hat{f}$ and, following O'Ceallaigh and Ruml (2015), have variance proportional to the difference between a node's $\hat{f}$ and $f$ values: $B(n) \sim \mathcal{N}(\hat{f}(n),\ \hat{f}(n) - f(n))$. The beliefs were implemented as truncated Gaussians bounded from below at the admissible $f$ value and above at three standard deviations. |
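
The Experiment Setup row describes two computations that can be sketched in a few lines: the inflated cost estimate (g plus 0.23 times d) and the truncated Gaussian belief built around it. The following is a minimal Python sketch under stated assumptions, not the authors' implementation. The names `f_hat`, `make_belief`, `TruncatedGaussianBelief`, and `variance_scale` are hypothetical. The proportionality constant for the variance is not given in the text above, so it is exposed as a parameter that defaults to 1, and "three standard deviations" is assumed to mean three standard deviations above the mean.

```python
import math
from dataclasses import dataclass

PER_STEP_ERROR = 0.23  # per-step heuristic error quoted in the setup row


def f_hat(g: float, d: float, per_step_error: float = PER_STEP_ERROR) -> float:
    """Cost estimate g(n) + per_step_error * d(n), as quoted in the setup row."""
    return g + per_step_error * d


def _phi(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


@dataclass(frozen=True)
class TruncatedGaussianBelief:
    mean: float   # centre of the untruncated Gaussian (f-hat)
    std: float    # standard deviation of the untruncated Gaussian
    lower: float  # lower truncation point (admissible f value)
    upper: float  # upper truncation point (assumed: mean + 3 * std)

    def expected_value(self) -> float:
        """Mean of the truncated distribution (standard closed form)."""
        if self.std == 0.0:
            return self.mean  # point mass: f-hat equals f
        a = (self.lower - self.mean) / self.std
        b = (self.upper - self.mean) / self.std
        mass = _Phi(b) - _Phi(a)
        return self.mean + self.std * (_phi(a) - _phi(b)) / mass


def make_belief(f: float, fhat: float, variance_scale: float = 1.0) -> TruncatedGaussianBelief:
    """Build a belief with variance = variance_scale * (fhat - f).

    variance_scale is a hypothetical knob; the source only says "proportional".
    """
    if fhat < f:
        raise ValueError("expected fhat >= f for an admissible f value")
    std = math.sqrt(variance_scale * (fhat - f))
    return TruncatedGaussianBelief(mean=fhat, std=std, lower=f, upper=fhat + 3.0 * std)


# Illustrative values only (not taken from the paper).
estimate = f_hat(g=5.0, d=10)
belief = make_belief(f=5.0, fhat=estimate)
```

Because the lower bound at the admissible f value usually sits closer to the centre than the upper bound does, the truncated belief's expected value tends to lie slightly above its centre, which is one reason to compute it explicitly rather than reuse the centre as the mean.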