Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Lilotane: A Lifted SAT-based Approach to Hierarchical Planning

Authors: Dominik Schreiber

JAIR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations confirm that Lilotane outperforms established SAT-based approaches, often by orders of magnitude, produces much smaller formulae on average, and compares favorably to other state-of-the-art HTN planners regarding robustness and plan quality. In the International Planning Competition (IPC) 2020, a preliminary version of Lilotane scored the second place.
Researcher Affiliation | Academia | Dominik Schreiber EMAIL Karlsruhe Institute of Technology, Kaiserstraße 12, 76131 Karlsruhe, Germany
Pseudocode | Yes | Algorithm 1: Lilotane Planning Procedure
Open Source Code | Yes | Our source code is available at www.github.com/domschrei/lilotane and all experimental data is available at www.github.com/domschrei/lilotane-experimental-data.
Open Datasets | Yes | The IPC was based on an exceptionally large and diverse set of benchmarks for hierarchical planning... Table 3 lists averaged properties of old and new benchmarks in accordance with our complexity model... The Factories HTN domain. In Proceedings of the 2020 International Planning Competition (IPC). To appear.
Dataset Splits | No | The paper evaluates planning performance on problem instances from benchmarks (e.g., IPC 2020 benchmarks). It does not involve machine learning-style training, validation, or test splits of a dataset for model development. Instead, the evaluation is performed on a collection of problem instances.
Hardware Specification | Yes | The experiments have been conducted on a desktop PC running Ubuntu 18.04 with a quad-core Intel i7-6700 processor clocked at 3.40GHz and with 32GB of DDR4 RAM. ... The evaluations were conducted on a server with an AMD EPYC 7702P 64-Core processor (plus hyperthreading) clocked between 2.0 and 3.35 GHz with 1024 GB of DDR4 RAM, running Ubuntu 20.04.
Software Dependencies | No | We have implemented our approach in C++17. Our source code is available at www.github.com/domschrei/lilotane... We make use of panda PIparser (Behnke et al., 2020)... We used the Re-entrant Incremental SAT solver API (IPASIR, see Balyo, Biere, Iser, & Sinz, 2016) and link our software with a SAT solver. As was the case for Tree-REX, we found Glucose (Audemard & Simon, 2009) to empirically work best among various solvers... We use PANDA in conjunction with SAT solver Cryptominisat (Soos, Nohl, & Castelluccia, 2009)... Although several software tools are mentioned, specific version numbers (e.g., for Glucose or Cryptominisat) are not explicitly provided; only their defining publications are cited.
Experiment Setup | Yes | We set a timeout of five minutes and a memory limit of 8GB. The experiments have been conducted on a desktop PC... The runs were performed sequentially... We executed up to 63 runs in parallel and set a time limit of 30 minutes and a memory limit of 8GB as in the IPC.
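
The per-run limits quoted under Experiment Setup could be enforced along the following lines. This is a minimal sketch under stated assumptions, not the harness used in the paper: the names Limits, DESKTOP_LIMITS, SERVER_LIMITS and run_with_limits are hypothetical, the time limit is treated as wall-clock time, "8GB" is read as 8 GiB of address space, and the code relies on the Unix-only resource module. Runs are sequential here; the parallel setup with up to 63 runs is not modelled.

```python
import resource
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    timeout_s: int     # wall-clock limit per run (assumption: wall-clock, not CPU time)
    memory_bytes: int  # address-space limit per run (assumption: 8GB read as 8 GiB)


# Values taken from the Experiment Setup row; the names are hypothetical.
DESKTOP_LIMITS = Limits(timeout_s=5 * 60, memory_bytes=8 * 1024**3)
SERVER_LIMITS = Limits(timeout_s=30 * 60, memory_bytes=8 * 1024**3)


def run_with_limits(cmd, limits):
    """Run cmd once and return 'ok', 'timeout', or 'error'."""

    def apply_memory_limit():
        # Executed in the child process just before exec. The soft limit is
        # clamped to the existing hard limit, which an unprivileged process
        # cannot raise.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = limits.memory_bytes
        if hard != resource.RLIM_INFINITY:
            soft = min(soft, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    try:
        proc = subprocess.run(
            cmd,
            timeout=limits.timeout_s,  # child is killed when this expires
            preexec_fn=apply_memory_limit,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "error"


if __name__ == "__main__":
    # Stand-in command; a real run would invoke the planner binary instead.
    print(run_with_limits([sys.executable, "-c", "print('plan found')"], DESKTOP_LIMITS))
```

A process that exceeds the memory limit typically fails its own allocation and exits with a non-zero status, so it is reported as 'error' rather than as a separate category.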