Landmark Generation in HTN Planning

Authors: Daniel Höller, Pascal Bercher (pp. 11826–11834)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental Evaluation | We evaluate our new LM generation on a widely used HTN benchmark set that has been used, e.g., by Höller et al. (2018) and Behnke, Höller, and Biundo (2019a,b). It contains 144 problem instances from 8 domains. Experiments ran on Xeon E5-2660 v3 CPUs with 4 GB RAM and a 10 min time limit.

Landmark Generation. Task LMs are extracted by both generation procedures. Over all instances, our generation finds 13% more task LMs than MT. Besides task LMs, our approach also extracts fact and method LMs; however, we find only very few method LMs (0 to 1 per instance). When we compare the full sets of LMs that are found (Figure 5), we extract more than twice the number of LMs over the entire instance set. Generation time is not an issue for either method: MT LM generation needs 0.03 ms on average, ours needs 1.3 ms.

Landmark-guided Search. Though the focus of this paper is on LM generation, we want to show that the newly found LMs carry information that helps guide the search. We therefore integrated the generation mechanisms into the PANDA framework (Höller et al. 2021) and combined them with the progression search algorithm described by Höller et al. (2020, Alg. 3). We realized the following heuristics:

- LMC-MT: Landmark count heuristic using MT LMs. Landmarks are extracted once for the initial task network. During search, reached LMs are tracked and the number of unfulfilled LMs is used as heuristic value.
- LMC-AND/OR: Same as before, but using our LM generation (which also includes fact and method LMs).
- LMC-AND/OR-R: As before, with an additional analysis checking whether all unfulfilled LMs are still reachable.

Note that a configuration with reachability analysis is not reasonable for MT LM generation: there, all LMs are reached by definition, so unreachability can never be detected. Consider Figures 1 and 2 again: after applying m2, a is not reachable anymore and the search node can be pruned. For the MT LM set {T, b}, however, pruning is not possible.
Figure 1 shows the coverage of several HTN planning systems. It contains the configuration with the highest coverage for each of our LM heuristics; the Relaxed Composition heuristic (Höller et al. 2018) with FF (Hoffmann and Nebel 2001) as inner heuristic (RC FF); TDGm and TDGc heuristics (Bercher et al. 2017), and compilation-based systems.
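The landmark-count heuristic and the reachability-based pruning described in the excerpt can be sketched as follows. This is a minimal illustration; the class and method names are assumptions for this sketch, not PANDA's actual API.

```python
# Sketch of a landmark-count heuristic with optional reachability pruning
# (in the spirit of LMC-AND/OR-R). All names here are hypothetical.

class LandmarkCountHeuristic:
    def __init__(self, landmarks):
        # Landmarks are extracted once for the initial task network.
        self.landmarks = frozenset(landmarks)

    def value(self, reached):
        # Heuristic value: number of landmarks not yet fulfilled.
        return len(self.landmarks - set(reached))

    def is_dead_end(self, reached, reachable):
        # Pruning: if some unfulfilled landmark is no longer reachable,
        # the search node can be discarded. (Not meaningful for MT LMs,
        # which are reached by definition.)
        return any(lm not in reachable
                   for lm in self.landmarks - set(reached))
```

For example, with landmarks {T, a, b} and only T reached, the heuristic value is 2; if a is additionally no longer reachable, the node is a dead end, mirroring the m2 example from the excerpt.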
Researcher Affiliation | Academia | Daniel Höller: Saarland University, Saarland Informatics Campus, Saarbrücken, Germany. Pascal Bercher: The Australian National University, College of Engineering and Computer Science, Canberra, Australia.
Pseudocode | Yes | Definition 4 (Mandatory Task Landmarks). Let P = (F, A, C, M, s_0, tn_I, g, δ) be an HTN planning problem with tn_I = (T_I, ≺_I, α_I). For a primitive task a ∈ A, we define the set of mandatory tasks as MT(a) = {a}, and for an abstract task c ∈ C it is defined as follows:

MT(c) = {c} ∪ ⋂_{(c,(T,≺,α)) ∈ M} ⋃_{t ∈ T} MT(α(t))

A set of MT landmarks LM_mt for P can be computed by:

1  LM_mt ← ⋃_{t ∈ T_I} {α_I(t)}
2  while LM_mt changes do
3      LM_mt ← LM_mt ∪ ⋃_{n ∈ LM_mt} MT(n)
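A direct reading of the definition and fixpoint loop above can be sketched in a few lines. The representation of methods as a mapping from each abstract task to the subtask lists of its methods is an illustrative assumption, and the cycle guard is a simplification for recursive hierarchies.

```python
# Sketch of mandatory-task (MT) landmark extraction. `methods` maps each
# abstract task name to a list of its methods' subtask lists (an assumed,
# simplified encoding of (c, (T, ≺, α)) ∈ M).

def mt(task, methods, seen=None):
    """MT(task): tasks occurring in every decomposition of `task`."""
    if seen is None:
        seen = set()
    if task in seen:            # simple cycle guard for recursive hierarchies
        return {task}
    if task not in methods:     # primitive task: MT(a) = {a}
        return {task}
    seen = seen | {task}
    # Union over each method's subtasks, intersection over all methods.
    per_method = [
        set().union(*(mt(t, methods, seen) for t in subtasks))
        for subtasks in methods[task]
    ]
    common = set.intersection(*per_method) if per_method else set()
    return {task} | common

def mt_landmarks(initial_tasks, methods):
    """Fixpoint computation of LM_mt, as in the three-line loop above."""
    lm = set(initial_tasks)
    changed = True
    while changed:
        new = lm | set().union(*(mt(n, methods) for n in lm))
        changed = new != lm
        lm = new
    return lm
```

For instance, if abstract task T has two methods with subtasks {a, b} and {a, c}, only a occurs in every decomposition, so the MT landmarks of an initial network containing T are {T, a}.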
Open Source Code | Yes | PANDA is available online at panda.hierarchical-task.net
Open Datasets | Yes | We evaluate our new LM generation on a widely-used HTN benchmark set. It has e.g. been used by Höller et al. (2018) and Behnke, Höller, and Biundo (2019a,b).
Dataset Splits | No | The paper describes the use of a "widely-used HTN benchmark set" but does not specify any training, validation, or test splits, percentages, or cross-validation methodology.
Hardware Specification | Yes | Experiments ran on Xeon E5-2660 v3 CPUs with 4 GB RAM and a 10 min time limit.
Software Dependencies | No | The paper mentions integrating their methods into the "PANDA framework (Höller et al. 2021)" and using a "progression search algorithm described by Höller et al. (2020, Alg. 3)". While these refer to specific software or algorithms, they do not provide version numbers for general ancillary software components such as programming languages (e.g., Python), libraries (e.g., PyTorch), or solvers (e.g., CPLEX).
Experiment Setup | No | The paper describes the heuristics implemented (LMC-MT, LMC-AND/OR, LMC-AND/OR-R) and compares their performance, but it does not specify concrete experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs), optimizer settings, or model initialization details.