Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Property Directed Reachability for Automated Planning
Authors: M. Suda
JAIR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An experimental comparison to the state of the art planners finds it highly competitive, solving most problems on several domains. |
| Researcher Affiliation | Academia | Martin Suda EMAIL Max-Planck-Institut für Informatik, Saarbrücken, Germany Charles University, Prague, Czech Republic |
| Pseudocode | Yes | Pseudocode 1 Algorithm PDR(Σ, I, G, T): Input: A symbolic transition system S = (Σ, I, G, T) Output: A witnessing path for S or a guarantee that no path exists |
| Open Source Code | Yes | The source code of PDRplan is publicly available on our web page (Suda, 2014), which also contains all the other material relevant for reproducing the experiments. |
| Open Datasets | Yes | We tested the planners on the STRIPS15 benchmarks of the International Planning Competition (IPC, 2014). |
| Dataset Splits | No | We tested the planners on the STRIPS15 benchmarks of the International Planning Competition (IPC, 2014). We used all the available STRIPS domains except the following: ... Altogether, we collected 1561 problems in 49 domains (see Table 3 on page 304 for a detailed list). While the paper uses established competition benchmarks, it does not specify any explicit training/test/validation dataset splits used by the authors themselves for their experiments. |
| Hardware Specification | Yes | We performed the experiments on machines with 3.16 GHz Intel Xeon CPU, 16 GB RAM, running Debian 6.0. |
| Software Dependencies | Yes | It internally relies on the SAT-solver Minisat (Eén & Sörensson, 2003) version 2.2. |
| Experiment Setup | Yes | We used a time limit of 180 seconds per problem instance for most of the runs, but increased it to 1800 seconds for the main comparison. |
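To give a feel for the PDR(Σ, I, G, T) interface quoted in the Pseudocode row, here is a minimal sketch of the PDR/IC3 loop over an *explicit-state* transition system. This is an illustrative simplification, not the paper's PDRplan: the actual planner works symbolically and discharges its queries with Minisat, whereas here frames are plain state sets, predecessors are precomputed, and there is no clause learning or generalization. The function name `pdr_reachable` and the dictionary-based encoding of transitions are our own choices.

```python
def pdr_reachable(states, init, goals, trans):
    """Explicit-state sketch of the PDR loop (no symbolic reasoning, no
    clause learning). Frame R[i] over-approximates the states reachable
    from init in at most i steps. Returns True iff a goal is reachable."""
    if init & goals:
        return True
    # Precompute predecessors of every state.
    pred = {s: set() for s in states}
    for s, succs in trans.items():
        for t in succs:
            pred[t].add(s)
    R = [set(init)]                      # R[0] is exact
    k = 0
    while True:
        k += 1
        R.append(set(states))            # fresh frame: nothing blocked yet
        # Obligation (s, i): s lies in frame i on a candidate path to a goal.
        obligations = [(g, k) for g in goals]
        while obligations:
            s, i = obligations.pop()
            if i == 0 or s in init:
                return True              # the candidate path reaches init
            if s not in R[i]:
                continue                 # s was blocked in the meantime
            live_preds = pred[s] & R[i - 1]
            if live_preds:
                # A surviving predecessor must be resolved one frame earlier.
                obligations.append((s, i))
                obligations.append((live_preds.pop(), i - 1))
            else:
                # No predecessor survives in R[i-1]: s is unreachable
                # within i steps, so block it up to frame i.
                for j in range(1, i + 1):
                    R[j].discard(s)
        # Two equal adjacent frames form an inductive invariant that
        # excludes all goal states: no path exists.
        if any(R[i] == R[i + 1] for i in range(k)):
            return False
```

The two return sites mirror the quoted output contract: an obligation chain reaching an initial state corresponds to a witnessing path, and an inductive fixed point of the frames is the guarantee that no path exists.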