Learning and Exploiting Progress States in Greedy Best-First Search
Authors: Patrick Ferber, Liat Cohen, Jendrik Seipp, Thomas Keller
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method using the h+ and hFF heuristics [Hoffmann and Nebel, 2001] and show that our approach successfully learns useful formulas to identify progress states. We observe a trade-off between the quality of the formulas and the time to evaluate them. Most importantly, we show that exploiting progress states is beneficial: it significantly reduces the number of expansions required to find a plan. (A sketch of one way to exploit progress states in GBFS appears below the table.) |
| Researcher Affiliation | Academia | Patrick Ferber¹,², Liat Cohen¹, Jendrik Seipp³ and Thomas Keller¹,⁴. ¹University of Basel, Basel, Switzerland; ²Saarland University, Saarland Informatics Campus, Saarbrücken, Germany; ³Linköping University, Linköping, Sweden; ⁴University of Zurich, Zurich, Switzerland |
| Pseudocode | No | The paper describes algorithmic steps in prose but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our benchmarks, code, experiment data, and supplemental results are available online [Ferber et al., 2022]. |
| Open Datasets | No | For each domain, we define a parameter space (e.g., a range for the number of balls in Gripper) and use it to generate small training and validation instances with PDDL task generators [Seipp et al., 2022]. No direct link to, or specific citation for, a publicly available version of the generated datasets is provided. |
| Dataset Splits | No | For each domain, we define a parameter space (...) and use it to generate small training and validation instances with PDDL task generators [Seipp et al., 2022]. As our test sets, i.e., for the GBFS runs with and without learned formulas, we use the union of Autoscale 21.11 tasks for optimal and satisficing planning [Torralba et al., 2021]. No explicit percentage splits or absolute counts for training/validation/test on a single dataset are provided. |
| Hardware Specification | Yes | All steps are executed on a single core of an Intel Xeon Silver CPU. |
| Software Dependencies | No | The paper mentions software like Fast Downward, DLPlan, scikit-learn, Downward Lab, and SymPy, but does not provide specific version numbers for these components. |
| Experiment Setup | Yes | For each training and validation instance and each heuristic, we generate the labeled state space with a memory limit of 3.5 GiB and a time limit of 5 hours. We impose no limit on the complexity of the generated features but instead limit the feature generation procedure to 24 hours and 3.5 GiB of memory. (...) Thus, we weight all samples such that both classes have the same impact on the final formula. Furthermore, since some training instances have state spaces of significantly different sizes, we additionally weight all samples to enforce that each instance has the same impact on the final formula. (A sketch of this weighting scheme appears below the table.) |
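
The Research Type row notes that exploiting progress states significantly reduces the number of expansions. Below is a minimal Python sketch of one plausible exploitation scheme, assuming that once a progress state is expanded, queued states with a higher heuristic value can be discarded. The callbacks `successors`, `h`, `is_goal`, and the learned classifier `is_progress` are hypothetical placeholders, not the paper's Fast Downward implementation:

```python
import heapq
import itertools


def gbfs_with_progress_pruning(initial, successors, h, is_goal, is_progress):
    """Greedy best-first search that prunes the open list whenever an
    expanded state is classified as a progress state.

    `is_progress` stands in for the learned formula; this is an
    illustrative sketch, not the planner's actual code.
    """
    tie = itertools.count()  # tie-breaker so states are never compared
    open_list = [(h(initial), next(tie), initial)]
    seen = {initial}
    expansions = 0
    while open_list:
        h_val, _, state = heapq.heappop(open_list)
        expansions += 1
        if is_goal(state):
            return state, expansions
        if is_progress(state):
            # Assumed pruning rule: after a progress state, queued states
            # with a strictly higher h-value are never expanded, so drop them.
            open_list = [entry for entry in open_list if entry[0] <= h_val]
            heapq.heapify(open_list)
        for succ in successors(state):
            if succ not in seen:
                seen.add(succ)
                heapq.heappush(open_list, (h(succ), next(tie), succ))
    return None, expansions
```

Under the stated assumption, the pruning keeps the open list small without removing any state that a plain GBFS would still have needed to expand.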
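The Experiment Setup row quotes a two-level sample-weighting scheme: both classes and every training instance should have the same impact on the learned formula. Here is a small sketch of such weights, assuming (as the quote does not specify) that the class-level and instance-level corrections are simply multiplied:

```python
import numpy as np


def sample_weights(labels, instance_ids):
    """Per-sample weights giving (a) both classes and (b) every training
    instance the same total influence. The multiplicative combination of
    the two corrections is an assumption, not taken from the paper."""
    labels = np.asarray(labels)
    instance_ids = np.asarray(instance_ids)
    n = len(labels)
    weights = np.ones(n, dtype=float)
    # Class balancing: each of the two classes receives total weight n / 2.
    for cls in np.unique(labels):
        mask = labels == cls
        weights[mask] *= n / (2.0 * mask.sum())
    # Instance balancing: each instance receives the same total weight,
    # so instances with huge state spaces do not dominate the training signal.
    instances = np.unique(instance_ids)
    for inst in instances:
        mask = instance_ids == inst
        weights[mask] *= n / (len(instances) * mask.sum())
    return weights
```

Since the paper mentions scikit-learn, a weight vector like this could be passed to an estimator through the `sample_weight` argument of `fit`.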