Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Lilotane: A Lifted SAT-based Approach to Hierarchical Planning
Authors: Dominik Schreiber
JAIR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations confirm that Lilotane outperforms established SAT-based approaches, often by orders of magnitude, produces much smaller formulae on average, and compares favorably to other state-of-the-art HTN planners regarding robustness and plan quality. In the International Planning Competition (IPC) 2020, a preliminary version of Lilotane scored the second place. |
| Researcher Affiliation | Academia | Dominik Schreiber EMAIL Karlsruhe Institute of Technology, Kaiserstraße 12 76131 Karlsruhe, Germany |
| Pseudocode | Yes | Algorithm 1: Lilotane Planning Procedure |
| Open Source Code | Yes | Our source code is available at www.github.com/domschrei/lilotane and all experimental data is available at www.github.com/domschrei/lilotane-experimental-data. |
| Open Datasets | Yes | The IPC was based on an exceptionally large and diverse set of benchmarks for hierarchical planning... Table 3 lists averaged properties of old and new benchmarks in accordance with our complexity model... The Factories HTN domain. In Proceedings of the 2020 International Planning Competition (IPC). To appear. |
| Dataset Splits | No | The paper evaluates planning performance on problem instances from benchmarks (e.g., IPC 2020 benchmarks). It does not involve machine learning-style training, validation, or test splits of a dataset for model development. Instead, the evaluation is performed on a collection of problem instances. |
| Hardware Specification | Yes | The experiments have been conducted on a desktop PC running Ubuntu 18.04 with a quad-core Intel i7-6700 processor clocked at 3.40GHz and with 32GB of DDR4 RAM. ... The evaluations were conducted on a server with an AMD EPYC 7702P 64-Core processor (plus hyperthreading) clocked between 2.0 and 3.35 GHz with 1024 GB of DDR4 RAM, running Ubuntu 20.04. |
| Software Dependencies | No | We have implemented our approach in C++17. Our source code is available at www.github.com/domschrei/lilotane... We make use of panda PIparser (Behnke et al., 2020)... We used the Re-entrant Incremental SAT solver API (IPASIR, see Balyo, Biere, Iser, & Sinz, 2016) and link our software with a SAT solver. As was the case for Tree-REX, we found Glucose (Audemard & Simon, 2009) to empirically work best among various solvers... We use PANDA in conjunction with SAT solver Cryptominisat (Soos, Nohl, & Castelluccia, 2009)... Although several software tools are mentioned, specific version numbers (e.g., for Glucose or Cryptominisat) are not explicitly provided, only their defining publications. |
| Experiment Setup | Yes | We set a timeout of five minutes and a memory limit of 8GB. The experiments have been conducted on a desktop PC... The runs were performed sequentially... We executed up to 63 runs in parallel and set a time limit of 30 minutes and a memory limit of 8GB as in the IPC. |
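
The per-run time and memory limits quoted in the Experiment Setup row could be enforced along the following lines. This is a minimal sketch, not the paper's benchmarking harness: `run_with_limits`, its parameters, and the returned dictionary are hypothetical names chosen for illustration, the code assumes a Unix-like system (it relies on Python's `resource` module), and the command in the usage example is a placeholder rather than the planner's actual command line. The memory cap is approximated by limiting the child's address space.

```python
import resource
import subprocess
import sys


def run_with_limits(cmd, timeout_s=300, mem_bytes=8 * 1024**3):
    """Run cmd with a wall-clock timeout and an address-space cap (hypothetical helper).

    Defaults mirror the quoted setup: five minutes and 8 GB.
    """

    def set_limits():
        # Runs in the child process just before exec.
        # Only the soft limit is lowered; it is clamped to the existing
        # hard limit so the call cannot fail on restricted systems.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = mem_bytes
        else:
            soft = min(mem_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,
            preexec_fn=set_limits,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return {"status": "timeout", "returncode": None, "stdout": ""}
    return {"status": "finished", "returncode": proc.returncode, "stdout": proc.stdout}


if __name__ == "__main__":
    # Placeholder command; substitute the planner invocation under test.
    result = run_with_limits([sys.executable, "-c", "print('ok')"])
    print(result["status"], result["returncode"])
```

For the second setup quoted above, the same helper would be called with `timeout_s=1800`; running up to 63 such calls in parallel is left out of the sketch.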