Metareasoning for Planning Under Uncertainty
Authors: Christopher H. Lin, Andrey Kolobov, Ece Kamar, Eric Horvitz
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a set of experiments showing the performance of these algorithms versus baselines in several synthetic domains with different properties, and characterize their performance with a measure that we call the metareasoning gap, a measure of the potential for improvement from metareasoning. The experiments demonstrate that the proposed techniques excel when the metareasoning gap is large. |
| Researcher Affiliation | Collaboration | Christopher H. Lin, University of Washington, Seattle, WA (chrislin@cs.washington.edu); Andrey Kolobov, Ece Kamar, Eric Horvitz, Microsoft Research, Redmond, WA ({akolobov,eckamar,horvitz}@microsoft.com) |
| Pseudocode | No | The paper describes algorithms in text but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the methodology, nor does it provide any links to a code repository. |
| Open Datasets | No | The paper describes using 'synthetic domains' and a '100x100 grid world' built by the authors, but it does not provide concrete access information (link, DOI, citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We set the parameters of the domain as follows so that there is a policy that can get the agent to the goal with a small number of steps (in tens instead of hundreds) and where the winds significantly influence the number of steps needed to get to the goal: The agent can move 11 cells at a time and the wind has a pushing power of 10 cells. We vary the cost of thinking and acting between 1 and 15. Each BRTDP trajectory is 50 actions long. |
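
The parameters quoted in the Experiment Setup row can be collected in a small configuration object, which makes it easier to see why the wind matters so much. The sketch below is illustrative only: the names `WindyGridConfig` and `net_move` are hypothetical, and the additive, deterministic wind model with clamping at the grid boundary is an assumption, not the paper's stated dynamics. Only the numeric values come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindyGridConfig:
    """Values quoted in the table; field names are hypothetical."""
    grid_size: int = 100         # 100x100 grid world
    agent_step: int = 11         # cells the agent moves per action
    wind_push: int = 10          # cells the wind pushes the agent
    min_cost: int = 1            # lowest cost of thinking/acting tried
    max_cost: int = 15           # highest cost of thinking/acting tried
    trajectory_length: int = 50  # actions per BRTDP trajectory


def net_move(pos, action_dir, wind_dir, cfg):
    """Assumed transition: agent displacement plus wind push, clamped to the grid.

    pos is an (x, y) cell; action_dir and wind_dir are unit direction
    vectors such as (1, 0) or (0, -1).
    """
    def clamp(v):
        return max(0, min(cfg.grid_size - 1, v))

    x = pos[0] + action_dir[0] * cfg.agent_step + wind_dir[0] * cfg.wind_push
    y = pos[1] + action_dir[1] * cfg.agent_step + wind_dir[1] * cfg.wind_push
    return (clamp(x), clamp(y))


cfg = WindyGridConfig()
against_wind = net_move((50, 50), (1, 0), (-1, 0), cfg)
with_wind = net_move((50, 50), (1, 0), (1, 0), cfg)
```

Under this assumed model, moving directly against the wind yields a net progress of 11 - 10 = 1 cell per action, while moving with it yields 11 + 10 = 21 cells, which is consistent with the table's remark that the winds significantly influence the number of steps needed to reach the goal.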