PRP Rebooted: Advancing the State of the Art in FOND Planning

Authors: Christian Muise, Sheila A. McIlraith, J. Christopher Beck

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our objective is to understand the performance of PR2 compared to other FOND planners in terms of coverage, solution size, and solve time. We implemented PR2 on top of the Fast Downward planning system (Helmert 2006) and used very little of the released code for PRP: specifically, much of the translation and scripts were re-used, as is the case with other FOND planners (e.g., MyND and FONDSAT use the same parsing mechanism). The code, benchmarks, and detailed analysis can be found at mulab.ai/pr2. We compare against the state of the art in FOND planning: MyND (Mattmüller et al. 2010), FONDSAT (Geffner and Geffner 2018), PRP (Muise, McIlraith, and Beck 2012), and Paladinus (Pereira et al. 2022). We configured each planner to its best settings based on aggregate performance across all domains, including using a modern SAT solver for FONDSAT (improving its coverage by a fair margin). Planners were given 4 GB of memory and 60 minutes to solve an instance, and evaluations were run on a PowerEdge C6420 machine running Ubuntu with an Intel 5218 2.3 GHz processor. To evaluate our planners, we collected all of the benchmarks employed for evaluation of the FOND planners listed above, representing a total of 18 domains. (A sketch of how such per-instance resource limits might be enforced appears after this table.)
Researcher Affiliation | Collaboration | Christian Muise (1,3), Sheila A. McIlraith (2,3), J. Christopher Beck (2); 1 Queen's University, Kingston, Canada; 2 University of Toronto, Toronto, Canada; 3 Vector Institute for Artificial Intelligence, Toronto, Canada; christian.muise@queensu.ca, sheila@cs.toronto.edu, jcb@mie.utoronto.ca
Pseudocode | Yes | Algorithm 1: PR2 High-Level Planner; Algorithm 2: Fixed-Point Regression (fpr); Algorithm 3: Strong Cyclic Marking
Open Source Code | Yes | The code, benchmarks, and detailed analysis can be found at mulab.ai/pr2.
Open Datasets | No | The paper mentions collecting the existing benchmarks used to evaluate the listed FOND planners (18 domains in total) rather than introducing a newly released dataset.
Dataset Splits | No | The paper does not explicitly describe training/test/validation dataset splits. It mentions only the benchmark instances from 18 domains used for evaluation.
Hardware Specification | Yes | Evaluations were run on a PowerEdge C6420 machine running Ubuntu with an Intel 5218 2.3 GHz processor.
Software Dependencies | No | The paper mentions building PR2 on top of the Fast Downward planning system and using a modern SAT solver for FONDSAT, but does not list specific software versions or dependencies.
Experiment Setup | No | The paper describes general experimental settings like memory and time limits, and that planners were configured to their best settings based on aggregate performance across all domains, but it does not provide the full per-planner configuration details.
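
The paper reports only the resource limits themselves (4 GB of memory, 60 minutes per instance). Below is a minimal sketch, not taken from the released code, of how a benchmark runner might enforce those limits on a POSIX system. The planner command, PDDL file names, and output handling are hypothetical placeholders.

```python
"""Minimal sketch of a resource-limited benchmark runner (assumptions noted below)."""
import resource
import subprocess

MEM_LIMIT_BYTES = 4 * 1024 ** 3   # 4 GB memory cap, as reported in the paper
TIME_LIMIT_SECONDS = 60 * 60      # 60-minute cutoff, as reported in the paper


def limit_memory():
    # Cap the address space of the child (planner) process at 4 GB.
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))


def run_instance(planner_cmd, domain_pddl, problem_pddl):
    """Run one planner on one instance and report whether it finished in time.

    `planner_cmd` is a hypothetical command prefix, e.g. ["./pr2"]; the actual
    invocation of PR2 or the other planners may differ.
    """
    try:
        result = subprocess.run(
            planner_cmd + [domain_pddl, problem_pddl],
            preexec_fn=limit_memory,     # apply the memory cap in the child process
            timeout=TIME_LIMIT_SECONDS,  # enforce the per-instance time limit
            capture_output=True,
            text=True,
        )
        return {"solved": result.returncode == 0, "output": result.stdout}
    except subprocess.TimeoutExpired:
        return {"solved": False, "output": "timeout"}


if __name__ == "__main__":
    # Hypothetical example call; file names are placeholders.
    print(run_instance(["./pr2"], "domain.pddl", "p01.pddl"))
```

In practice, coverage numbers like those reported in the paper would come from looping such a runner over every instance in the 18 benchmark domains and counting how many instances each planner solves within the limits.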