Generalized Planning with Positive and Negative Examples
Authors: Javier Segovia-Aguas, Sergio Jiménez, Anders Jonsson (pp. 9949-9956)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments: This section reports the empirical performance of our approach for the synthesis and evaluation of programs for generalized planning. All experiments are run on an Intel Core i5 2.90GHz x 4 with a memory limit of 4GB and 600 seconds of planning timeout. In order to compare with previous approaches, we use Fast Downward (Helmert 2006) in the LAMA-2011 setting (Richter, Westphal, and Helmert 2011) to synthesize and evaluate programs using the presented compilations. |
| Researcher Affiliation | Academia | Javier Segovia-Aguas (1), Sergio Jiménez (2), Anders Jonsson (3). (1) IRI, Institut de Robòtica i Informàtica Industrial, CSIC-UPC; (2) VRAIN, Valencian Research Institute for Artificial Intelligence, Universitat Politècnica de València; (3) Universitat Pompeu Fabra |
| Pseudocode | No | The paper describes methods and actions formally but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Footnote 1: The source code, benchmarks and scripts are in the Automated Programming Framework (Segovia-Aguas 2017) such that any experimental data in the paper can be reproduced. |
| Open Datasets | No | The paper describes generalized planning tasks like 'Green Block', 'Fibonacci', 'Gripper', 'List', 'Triangular Sum', and 'Robo Painter' and mentions using 'almost 100 random configurations with at most 5 instances that could be either positive or negative'. However, it does not provide concrete access information (link, DOI, or formal citation for the datasets themselves) to these tasks or instances, nor does it explicitly state they are publicly available datasets in the typical sense. |
| Dataset Splits | No | The paper states: 'For the synthesis of programs that solve the previous generalized planning tasks, we compare two versions of our compilation, PN-Lite and PN, with the results from some problems whose solutions where solved and reported as One Procedure in Segovia-Aguas, Jiménez, and Jonsson (2016). We use PN to denote the version with positive and negative examples that detect the three possible failures of a planning program, whereas PN-Lite is a simpler sound version that detects incomplete programs and inapplicable actions but not infinite loops. In this experiment we have run almost 100 random configurations with at most 5 instances that could be either positive or negative (where at least one is forced to be positive, see the previous section).' and 'Negative examples are useful for defining quantitative metrics that evaluate the coverage of generalized plans with respect to a test set of unseen examples.' While it mentions synthesis from instances and evaluation on a test set, it does not provide specific details on how the data is split (e.g., exact percentages, sample counts for training/validation/test, or cross-validation setup). |
| Hardware Specification | Yes | All experiments are run on an Intel Core i5 2.90GHz x 4 with a memory limit of 4GB and 600 seconds of planning timeout. |
| Software Dependencies | No | The paper mentions using 'Fast Downward (Helmert 2006) in the LAMA-2011 setting (Richter, Westphal, and Helmert 2011)', but does not provide specific version numbers for these software tools. |
| Experiment Setup | Yes | All experiments are run on an Intel Core i5 2.90GHz x 4 with a memory limit of 4GB and 600 seconds of planning timeout. |
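
The reported setup (Fast Downward in the LAMA-2011 setting, a 4GB memory limit and a 600-second planning timeout) might be reproduced along the lines of the sketch below. The paper does not give the exact command line, so everything here is an assumption: the driver path `./fast-downward.py`, the choice of the `seq-sat-lama-2011` alias as the "LAMA-2011 setting", the file names `domain.pddl` and `problem.pddl`, and the helper name `build_planner_command`, which is hypothetical. Because no version numbers are reported, the flags may also differ in the Fast Downward release the authors used.

```python
# Minimal sketch, NOT the authors' script: builds a Fast Downward driver
# command with the limits reported in the paper (600 s, 4 GB).

def build_planner_command(domain, problem,
                          driver="./fast-downward.py",   # assumed driver location
                          time_limit_s=600,              # planning timeout from the paper
                          memory_limit="4G"):            # memory limit from the paper
    """Return the argument list for one planner run (hypothetical helper)."""
    return [
        driver,
        "--alias", "seq-sat-lama-2011",                  # assumed alias for the LAMA-2011 setting
        "--overall-time-limit", f"{time_limit_s}s",
        "--overall-memory-limit", memory_limit,
        domain,
        problem,
    ]


# Placeholder file names; the compiled PDDL tasks would come from the
# authors' framework.
cmd = build_planner_command("domain.pddl", "problem.pddl")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` once a local Fast Downward checkout is available; the sketch only assembles the command and does not invoke the planner.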