Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Practical Approach to Discretised PDDL+ Problems by Translation to Numeric Planning
Authors: Francesco Percassi, Enrico Scala, Mauro Vallati
JAIR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental analysis shows the usefulness of the proposed translation and demonstrates the potential of the approach for improving the solvability of complex PDDL+ instances. We validate the resulting formulations against a set of challenging benchmark domains, including real-world applications, and well-known planning engines, and we assess the impact of the introduced optimisations. An extensive experimental analysis is provided in Section 5. |
| Researcher Affiliation | Academia | Francesco Percassi EMAIL School of Computing and Engineering, University of Huddersfield, UK; Enrico Scala EMAIL Dipartimento di Ingegneria dell'Informazione, Università degli Studi di Brescia, Italy; Mauro Vallati EMAIL School of Computing and Engineering, University of Huddersfield, UK |
| Pseudocode | Yes | Algorithm 1: Algorithm for under approximating when a is Trigger-Free w.r.t. an event ε |
| Open Source Code | Yes | The benchmark suite and the tool for translating PDDL+ instances are available at https://bit.ly/30gMyNW. |
| Open Datasets | Yes | We validate the resulting formulations against a set of challenging benchmark domains, including real-world applications, and well-known planning engines... We consider six benchmark domains. Three of them, Linear-Car (Lin-Car), Linear-Generator (Lin-Gen), and Solar-Rover (Rover), are well-known PDDL+ benchmarks. Overtaking-Car (OT-Car) is a version of Linear Car... Baxter (Bertolucci, Capitanelli, Maratea, Mastrogiovanni, & Vallati, 2019) and Urban-Traffic-Control (UTC) (Vallati, Magazzeni, Schutter, Chrpa, & McCluskey, 2016; McCluskey & Vallati, 2017) are taken from real-world applications. The benchmark suite and the tool for translating PDDL+ instances are available at https://bit.ly/30gMyNW. |
| Dataset Splits | No | The paper uses various benchmark domains with multiple problem instances (e.g., 'Rover (20)', 'Lin-Car (10)') for evaluation. However, it does not describe specific training, validation, or test dataset splits in the context of partitioning a single dataset for model training or evaluation, which is typical for machine learning experiments. The evaluation is performed by solving problems on these predefined instances. |
| Hardware Specification | Yes | Our experiments were run on an Intel Xeon Gold 6140M CPU with 2.30 GHz. |
| Software Dependencies | Yes | As a PDDL2.1 planning engine we use the well-known Metric-FF (Hoffmann, 2003). We consider three engines at the state of the art for PDDL+ planning: ENHSP version 20 (Scala et al., 2020), SMTPlan (Cashmore et al., 2020), DiNo (Piotrowski, Fox, Long, Magazzeni, & Mercorio, 2016) and UPMurphi (Penna, Magazzeni, & Mercorio, 2012)... Our implementation of the translator is written in Python 3 and makes use of the SymPy library (Meurer et al., 2017). |
| Experiment Setup | Yes | All the planning engines have been run using default parameters. For each instance, we set a cutoff time of 900 seconds, and memory was limited to 8 GB. |
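
The Experiment Setup row reports a 900-second cutoff and an 8 GB memory limit per instance, but not how those limits were enforced. The following is a minimal sketch of one way to apply such limits when re-running a planner on a POSIX system. The helper names (`run_with_limits`, `_limit_memory`) are hypothetical, and the command passed in is a placeholder rather than any planner's actual command line.

```python
import resource
import subprocess
import sys

CUTOFF_SECONDS = 900               # cutoff time from the Experiment Setup row
MEMORY_LIMIT_BYTES = 8 * 1024**3   # 8 GB memory limit from the same row


def _limit_memory(limit_bytes):
    """Return a callable that caps the child's address space (POSIX only)."""
    def apply():
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never request a soft limit above the existing hard limit.
        if hard != resource.RLIM_INFINITY:
            soft = min(limit_bytes, hard)
        else:
            soft = limit_bytes
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
    return apply


def run_with_limits(command, cutoff=CUTOFF_SECONDS, memory=MEMORY_LIMIT_BYTES):
    """Run `command`; return (status, stdout), status in {'ok', 'error', 'timeout'}."""
    try:
        done = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=cutoff,                    # child is killed when this expires
            preexec_fn=_limit_memory(memory),  # applied in the child before exec
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    return ("ok" if done.returncode == 0 else "error"), done.stdout


if __name__ == "__main__":
    # Placeholder command: substitute the planner invocation under test.
    print(run_with_limits([sys.executable, "-c", "print('plan found')"]))
```

A wall-clock timeout is used here for simplicity. Whether the paper's cutoff refers to wall-clock or CPU time is not stated in the quoted text, so that choice is an assumption of this sketch.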