A Formal Metareasoning Model of Concurrent Planning and Execution
Authors: Amihay Elboher, Ava Bensoussan, Erez Karpas, Wheeler Ruml, Shahaf S. Shperberg, Eyal Shimony
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 7 (Empirical Evaluation): Our experimental setting is inspired by movies such as Indiana Jones or Die Hard, in which the hero is required to solve a puzzle before a deadline or suffer extreme consequences. As the water-jugs problem from Die Hard is too easy, we use the 15-puzzle with the Manhattan distance heuristic instead. We collected data by solving 10,000 15-puzzle instances, recording the number of expansions required by A* to find an optimal solution from each initial state, as well as the actual solution length. (A minimal data-collection sketch appears after the table.) |
| Researcher Affiliation | Academia | Amihay Elboher¹, Ava Bensoussan¹, Erez Karpas², Wheeler Ruml³, Shahaf S. Shperberg¹, Eyal Shimony¹ (¹Ben-Gurion University, Israel; ²Technion, Israel; ³University of New Hampshire, USA) |
| Pseudocode | Yes | Algorithm 1: Max-LETA, Algorithm 2: K-Bounded A*, Algorithm 3: Reduce-CoPE-to-SAE2, Algorithm 4: Schedule-Actions, Algorithm 5: Demand-Execution SAE2 Algorithm |
| Open Source Code | Yes | The implementation can be found in the following repository: https://github.com/amihayelboher/CoPE |
| Open Datasets | No | The paper uses the 15-puzzle problem and collects its own data by solving 10,000 instances to generate CoPE problems. While the 15-puzzle is a well-known benchmark, the paper does not provide a link or a specific citation to a pre-existing, publicly available dataset that was downloaded or used directly for training or evaluation. The data used in the experiments was generated by the authors themselves. |
| Dataset Splits | No | The paper describes running algorithms and simulating outcomes by sampling from distributions, but it does not specify traditional training, validation, and test splits as commonly found in machine-learning experiments. It evaluates the algorithms on generated CoPE instances rather than on models trained on split data. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, memory) used to run the experiments. It mentions runtime preferences for integration into temporal planners but not the actual hardware used for their empirical evaluation. |
| Software Dependencies | No | The paper mentions tools and algorithms like OPTIC, UCT, and UCB1, but does not provide specific version numbers for any software dependencies used in their implementation or experimental setup within the text. |
| Experiment Setup | Yes | In this setting, all base-level actions require the same number of time units to complete, denoted dur(b); in our experiments we considered dur(b) ∈ {1, 2, 3} (i.e., each 15-puzzle instance became three CoPE instances, differing only in the duration of the base-level action). ... Finally, to make the deadlines challenging, we used Xᵢ = 4·h(i) as the deadline for reaching the goal. ... MCTS with an exploration constant c = 2 and budgets of 10, 100, and 500 rollouts before selecting each time allocation. (A sketch of this instance-generation rule appears after the table.) |
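
The data-collection step quoted in the Research Type row is easy to illustrate. Below is a minimal Python sketch, not taken from the authors' repository, that solves randomly scrambled 15-puzzle instances with A* under the Manhattan-distance heuristic and records the expansion count and optimal solution length for each. All names (`GOAL`, `astar`, `random_instance`) are illustrative, and the random-walk scramble is an assumption; the paper does not say how its 10,000 instances were generated.

```python
# Hypothetical reconstruction of the data-collection step: solve 15-puzzle
# instances with A* (Manhattan distance) and record, per instance, the
# number of node expansions and the optimal solution length.
import heapq
import random

GOAL = tuple(range(1, 16)) + (0,)  # 0 denotes the blank tile

def manhattan(state):
    """Sum of Manhattan distances of all tiles from their goal cells."""
    dist = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = tile - 1
        dist += abs(idx // 4 - goal_idx // 4) + abs(idx % 4 - goal_idx % 4)
    return dist

def neighbors(state):
    """Yield states reachable by sliding one tile into the blank."""
    blank = state.index(0)
    r, c = divmod(blank, 4)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 4 and 0 <= nc < 4:
            swap = nr * 4 + nc
            s = list(state)
            s[blank], s[swap] = s[swap], s[blank]
            yield tuple(s)

def astar(start):
    """Return (expansions, optimal solution length) for one instance."""
    open_heap = [(manhattan(start), 0, start)]
    best_g = {start: 0}
    expansions = 0
    while open_heap:
        f, g, state = heapq.heappop(open_heap)
        if g > best_g.get(state, float("inf")):
            continue  # stale heap entry superseded by a cheaper path
        if state == GOAL:
            return expansions, g
        expansions += 1
        for nxt in neighbors(state):
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + manhattan(nxt), ng, nxt))
    raise ValueError("unsolvable instance")

def random_instance(walk_length=40):
    """Scramble the goal with a random walk, guaranteeing solvability."""
    state = GOAL
    for _ in range(walk_length):
        state = random.choice(list(neighbors(state)))
    return state

records = [astar(random_instance()) for _ in range(100)]  # 10,000 in the paper
```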
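The Experiment Setup row describes how each solved puzzle becomes three CoPE instances and how the deadline is chosen. The sketch below spells that rule out under the same assumptions; the `CoPEInstance` container and `ucb1` helper are hypothetical, with UCB1 written in its textbook form using the paper's exploration constant c = 2.

```python
# Hypothetical sketch of the instance-generation rule quoted above: each
# solved 15-puzzle instance yields three CoPE instances that differ only in
# the base-level action duration dur(b), with the deadline set to four
# times the heuristic estimate of the start state.
import math
from dataclasses import dataclass

@dataclass
class CoPEInstance:
    start: tuple     # initial 15-puzzle state
    dur_b: int       # duration of every base-level action, dur(b)
    deadline: int    # time units allowed for reaching the goal

def make_cope_instances(start, h):
    """Expand one puzzle into three CoPE instances, dur(b) in {1, 2, 3}."""
    deadline = 4 * h(start)  # deadline rule X_i = 4 * h(i), as quoted above
    return [CoPEInstance(start, dur_b, deadline) for dur_b in (1, 2, 3)]

def ucb1(mean_reward, visits, total_visits, c=2.0):
    """Textbook UCB1 score for picking the next time allocation in MCTS.

    Assumes visits >= 1 and total_visits >= 1.
    """
    return mean_reward + c * math.sqrt(math.log(total_visits) / visits)

# Usage with the manhattan heuristic from the previous sketch:
# instances = make_cope_instances(random_instance(), manhattan)
```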