Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

A Practical Approach to Discretised PDDL+ Problems by Translation to Numeric Planning

Authors: Francesco Percassi, Enrico Scala, Mauro Vallati

JAIR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental analysis shows the usefulness of the proposed translation and demonstrates the potential of the approach for improving the solvability of complex pddl+ instances. We validate the resulting formulations against a set of challenging benchmark domains, including real-world applications, and well-known planning engines, and we assess the impact of the introduced optimisations. An extensive experimental analysis is provided in Section 5.
Researcher Affiliation	Academia	Francesco Percassi EMAIL School of Computing and Engineering, University of Huddersfield, UK; Enrico Scala EMAIL Dipartimento di Ingegneria dell'Informazione, Università degli Studi di Brescia, Italy; Mauro Vallati EMAIL School of Computing and Engineering, University of Huddersfield, UK
Pseudocode	Yes	Algorithm 1: Algorithm for under approximating when a is Trigger-Free w.r.t. an event ε
Open Source Code	Yes	The benchmark suite and the tool for translating pddl+ instances are available at https://bit.ly/30gMyNW.
Open Datasets	Yes	We validate the resulting formulations against a set of challenging benchmark domains, including real-world applications, and well-known planning engines... We consider six benchmark domains. Three of them, Linear-Car (Lin-Car), Linear-Generator (Lin-Gen), and Solar-Rover (Rover), are well-known pddl+ benchmarks. Overtaking-Car (OT-Car) is a version of Linear Car... Baxter (Bertolucci, Capitanelli, Maratea, Mastrogiovanni, & Vallati, 2019) and Urban-Traffic-Control (UTC) (Vallati, Magazzeni, Schutter, Chrpa, & McCluskey, 2016; McCluskey & Vallati, 2017) are taken from real-world applications. The benchmark suite and the tool for translating pddl+ instances are available at https://bit.ly/30gMyNW.
Dataset Splits	No	The paper uses various benchmark domains with multiple problem instances (e.g., 'Rover (20)', 'Lin-Car (10)') for evaluation. However, it does not describe specific training, validation, or test dataset splits in the context of partitioning a single dataset for model training or evaluation, which is typical for machine learning experiments. The evaluation is performed by solving problems on these predefined instances.
Hardware Specification	Yes	Our experiments were run on an Intel Xeon Gold 6140M CPU with 2.30 GHz.
Software Dependencies	Yes	As a pddl2.1 planning engine we use the well-known Metric-FF (Hoffmann, 2003). We consider three engines at the state of the art for pddl+ planning: Enhsp version 20 (Scala et al., 2020), SMTPlan (Cashmore et al., 2020), DiNo (Piotrowski, Fox, Long, Magazzeni, & Mercorio, 2016) and UPMurphi (Penna, Magazzeni, & Mercorio, 2012)... Our implementation of the translator is written in Python 3 and makes use of the SymPy library (Meurer et al., 2017).
Experiment Setup	Yes	All the planning engines have been run using default parameters. For each instance, we set a cutoff time of 900 seconds, and memory was limited to 8 GB.
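
The resource limits quoted in the Experiment Setup row (900-second cutoff, 8 GB of memory) could be enforced along the following lines. This is a minimal sketch, not the authors' harness: the helper names `run_with_limits` and `_limit_memory` are hypothetical, 8 GB is read as 8 GiB, and the memory cap relies on the Unix-only `resource` module. The command passed in is whatever planner invocation is being evaluated; none is assumed here.

```python
import resource
import subprocess
import sys

CUTOFF_SECONDS = 900               # cutoff time reported in the paper
MEMORY_LIMIT_BYTES = 8 * 1024**3   # 8 GB limit from the paper, assumed to mean 8 GiB


def _limit_memory(max_bytes):
    """Return a callable that caps the child's address space (Unix only)."""
    def apply():
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never request more than the existing hard limit allows.
        if hard == resource.RLIM_INFINITY:
            cap = max_bytes
        else:
            cap = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))
    return apply


def run_with_limits(command, cutoff=CUTOFF_SECONDS, max_bytes=MEMORY_LIMIT_BYTES):
    """Run `command` and return (status, stdout).

    status is 'ok' (exit code 0), 'error' (non-zero exit code),
    or 'timeout' (the cutoff elapsed and the child was killed).
    """
    try:
        done = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=cutoff,
            preexec_fn=_limit_memory(max_bytes),
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    return ("ok" if done.returncode == 0 else "error"), done.stdout


if __name__ == "__main__":
    # Stand-in command; replace with the planner invocation under test.
    print(run_with_limits([sys.executable, "-c", "print('solved')"]))
```

A run that exceeds the memory cap typically surfaces as an 'error' status, since the child fails to allocate and exits with a non-zero code; distinguishing out-of-memory from other failures would need extra inspection of the planner's output.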