Efficient Black-Box Planning Using Macro-Actions with Focused Effects
Authors: Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method by learning macro-actions in a variety of black-box planning domains and subsequently using them for planning. We use PDDLGym [Silver and Chitnis, 2020] to automatically construct black-box simulators from classical PDDL planning problems. Additionally, we use two domain-specific simulators (for 15-puzzle and Rubik's cube) that have a different state representation to show the generality of our approach. See the appendix for implementation details and a discussion of how we selected the various macro-learning hyperparameters (sections B and E, respectively). We select the domains to give a representative picture of how the method performs on various types of planning problems. For PDDLGym compatibility reasons, we restrict the domains to those requiring only strips and typing. For the domain-specific simulators, we select 15-puzzle and Rubik's cube in particular, because they present opposing challenges for our macro-learning approach. *(A minimal PDDLGym usage sketch appears after the table.)* |
| Researcher Affiliation | Collaboration | Cameron Allen¹,², Michael Katz², Tim Klinger², George Konidaris¹, Matthew Riemer² and Gerald Tesauro² (¹Brown University, ²IBM Research) |
| Pseudocode | Yes | Algorithm 1: Learn macro-actions with focused effects *(a hedged Python sketch of this search appears after the table)* |
| Open Source Code | Yes | Code repository and supplementary materials for this paper are available at https://github.com/camall3n/focused-macros. |
| Open Datasets | Yes | We use PDDLGym [Silver and Chitnis, 2020] to automatically construct black-box simulators from classical PDDL planning problems. Additionally, we use two domain-specific simulators (for 15-puzzle and Rubik's cube)... On the 100 hardest Rubik's cube problems from Büchner [2018]... |
| Dataset Splits | No | The paper mentions generating 100 problem instances for evaluation but does not specify formal train/validation/test splits, as its method does not involve dataset-based model training. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using "PDDLGym" but does not specify any version numbers for PDDLGym or other software dependencies. |
| Experiment Setup | Yes | For each planning domain, we generate 100 problem instances with unique random starting states and a fixed goal condition... We search for macro-actions using best-first search (BFS) with a simulation budget of B_M state transitions... We save the N_M macro-actions with the lowest effect size... In Table 2, we show the average solve rate and number of generated states (i.e. simulator queries) for each domain... Again we find that focused macros substantially improve planning efficiency, likely because the heuristic still uses goal counting at its core. Surprisingly, we found that BFWS did not perform significantly better than the primitive-action GBFS baseline. *(Planning-phase sketch after the table.)* |
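To make the black-box interface in the Research Type row concrete, here is a minimal sketch using PDDLGym's standard Gym-style API. The Sokoban environment is an illustrative choice, not one confirmed by the excerpts above; the final line computes the transition's effect size (number of changed literals), the quantity the paper's macro learner minimizes.

```python
import pddlgym

# Construct a black-box simulator from a classical PDDL domain (STRIPS + typing).
# "PDDLEnvSokoban-v0" is an illustrative choice of domain.
env = pddlgym.make("PDDLEnvSokoban-v0")
obs, debug_info = env.reset()

# A black-box planner only queries the simulator: sample a grounded action,
# step, and observe the successor state.
action = env.action_space.sample(obs)
next_obs, reward, done, debug_info = env.step(action)

# Effect size: the number of state literals this transition changed.
print("effect size:", len(next_obs.literals ^ obs.literals))
```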
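The Pseudocode row names Algorithm 1 without reproducing it, but its description (best-first search over action sequences, keeping the N_M macros with the smallest effect size, under a budget of B_M simulator queries) suggests roughly the following shape. This is a hedged sketch, not the paper's implementation: `simulator(state, action) -> next_state` is an assumed interface, states are assumed to be fixed-length tuples of variable values (as in 15-puzzle), and details such as duplicate detection are omitted.

```python
import heapq
import itertools

def learn_focused_macros(simulator, start_state, actions, n_macros, budget):
    """Best-first search over action sequences, scored by effect size."""
    def effect_size(s0, s1):
        # Number of state variables whose values differ between two states.
        return sum(a != b for a, b in zip(s0, s1))

    counter = itertools.count()          # unique tie-breaker for heap entries
    frontier = [(0, next(counter), start_state, [])]  # (score, _, state, macro)
    best = []                            # bounded max-heap of saved macros
    sims = 0
    while frontier and sims < budget:
        score, _, state, macro = heapq.heappop(frontier)
        if macro:                        # save every expanded non-empty macro
            heapq.heappush(best, (-score, next(counter), macro))
            if len(best) > n_macros:
                heapq.heappop(best)      # evict the least-focused macro
        for action in actions:           # extend the macro by one primitive
            next_state = simulator(state, action)
            sims += 1
            heapq.heappush(frontier, (effect_size(start_state, next_state),
                                      next(counter), next_state, macro + [action]))
    return [m for _, _, m in sorted(best, reverse=True)]  # most focused first
```

Scoring candidate sequences by how few state variables they change is what biases the search toward "focused" macros.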
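The Experiment Setup row then uses the learned macros for planning with GBFS and a goal-count heuristic. A sketch under assumed interfaces: `successors(state)` yields (step, next_state) pairs over primitives plus learned macros, states are hashable sets of literals, and the budget accounting is illustrative.

```python
import heapq
import itertools

def gbfs(start, goal_literals, successors, max_sims):
    """Greedy best-first search with a goal-count heuristic."""
    def h(state):
        # Goal counting: number of goal literals not yet satisfied.
        return sum(1 for lit in goal_literals if lit not in state)

    counter = itertools.count()          # unique tie-breaker for heap entries
    frontier = [(h(start), next(counter), start, [])]
    seen = {start}
    sims = 0
    while frontier and sims < max_sims:
        _, _, state, plan = heapq.heappop(frontier)
        if h(state) == 0:
            return plan, sims            # every goal literal is satisfied
        for step, nxt in successors(state):
            sims += 1                    # each successor costs one simulator query
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (h(nxt), next(counter), nxt, plan + [step]))
    return None, sims                    # simulation budget exhausted
```

Counting generated states here mirrors the paper's "number of generated states (i.e. simulator queries)" metric.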