Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Planning for Hybrid Systems via Satisfiability Modulo Theories
Authors: Michael Cashmore, Daniele Magazzeni, Parisa Zehtabi
JAIR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present our experimental results in four parts. |
| Researcher Affiliation | Academia | Michael Cashmore EMAIL University of Strathclyde 26 Richmond Street, G1 1XH, Glasgow, UK Daniele Magazzeni EMAIL Parisa Zehtabi EMAIL King s College London Bush House, WC2B 4BG, London, UK |
| Pseudocode | Yes | Algorithm 1: SMTPlan |
| Open Source Code | Yes | 1. The source code and documentation for SMTPlan can be obtained from https://github.com/KCL-Planning/SMTPlan |
| Open Datasets | Yes | All test domains and problems are available at: kcl-planning.github.io/SMTPlan. |
| Dataset Splits | No | The paper evaluates its planner on benchmark problems and problem instances, but the concept of training/test/validation dataset splits, as typically understood in machine learning, is not applicable to the experimental methodology of solving planning problems. The paper does not describe how to split a dataset for training and testing its planner. |
| Hardware Specification | No | The experiments were run using 8GB of RAM and a 30 minute timeout. |
| Software Dependencies | Yes | In the current implementation of SMTPlan the CAS piranha is used, as it is able to solve polynomial non-linear integrals. ... The SMT solver we use is z3 (De Moura & Bjรธrner, 2008). ... (Biscani et al. (2018, May). bluescarni/piranha: piranha 0.11.) |
| Experiment Setup | Yes | For all experiments we use an initial happening bound of 2 (assuming the goal to not hold in the intial state) a step size of 1, and a bound on the length of the event cascade of 2. The bound on the event cascade is known to be the longest cycle-free chain of events possible across all of the benchmark domains. |