Preference Planning for Markov Decision Processes
Authors: Meilun Li, Zhikun She, Andrea Turrini, Lijun Zhang
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We develop P4Solver, an SMT-based planner computing the preferred plan by reducing the problem to quadratic programming problem, which can be solved using SMT solvers such as Z3. We illustrate the framework by applying our approach on two selected case studies. The experimental results confirm in general the effectiveness of the approach. |
| Researcher Affiliation | Academia | Meilun Li and Zhikun She: School of Mathematics and Systems Science, Beihang University, China. Andrea Turrini and Lijun Zhang: State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China. |
| Pseudocode | Yes | Algorithm 1: P4Solver. Input: MDP model M, preference formula ΦP, goal formula ΦG. Output: an optimal policy χ. |
| Open Source Code | No | The paper states that the authors 'implemented the P4Solver algorithm in Scala' but provides neither a link nor an explicit statement that the code is open-source or publicly available. |
| Open Datasets | Yes | The results for the rail robot and an adaptation of the dinner domain (Bienvenu, Fritz, and McIlraith 2011) are shown in Tables 1 and 2, respectively. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits needed to reproduce the experiment. |
| Hardware Specification | Yes | We have run our prototype on a single core of a laptop running openSUSE 13.1 on an Intel Core™ i5-4200M CPU and 8Gb of RAM. |
| Software Dependencies | No | The paper mentions using Scala, Z3, and the SMT-LIB format, and running on openSUSE 13.1. However, it does not provide specific version numbers for Scala, Z3, or the SMT-LIB tools used. |
| Experiment Setup | Yes | For the robot example, we consider N = 5, 6, 7 positions and B = 2 boxes and four preference formulas that require to eventually pickup or drop the boxes... The goal has to be satisfied with probability in [0.75, 1] and the preference with probability in [1, 1]. |
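
The probability intervals in the experiment setup can be illustrated with a minimal sketch. This is not P4Solver: the paper synthesizes a policy through an SMT encoding, whereas the code below only evaluates a fixed policy on a hypothetical two-action MDP and checks whether the probability of reaching the goal lies in [0.75, 1]. The MDP, the state and action names, and the helper functions `reach_probability` and `within` are all assumptions made for illustration.

```python
# Hypothetical toy MDP (not from the paper):
# (state, action) -> list of (probability, successor state).
TRANSITIONS = {
    ("s0", "safe"): [(0.8, "goal"), (0.2, "fail")],
    ("s0", "risky"): [(0.5, "goal"), (0.5, "fail")],
}


def reach_probability(transitions, policy, targets, start, sweeps=1000):
    """Approximate the probability of reaching `targets` from `start`
    under a fixed memoryless policy, by repeated in-place updates."""
    states = {start} | set(targets)
    for (state, _action), outcomes in transitions.items():
        states.add(state)
        states.update(successor for _prob, successor in outcomes)

    # Target states have probability 1; everything else starts at 0.
    prob = {s: (1.0 if s in targets else 0.0) for s in states}
    for _ in range(sweeps):
        for s in states:
            # States without a chosen action are treated as absorbing.
            if s in targets or s not in policy:
                continue
            prob[s] = sum(p * prob[t] for p, t in transitions[(s, policy[s])])
    return prob[start]


def within(value, low, high, eps=1e-9):
    """Check low <= value <= high, with a small numerical tolerance."""
    return low - eps <= value <= high + eps


# Interval required for the goal formula in the quoted setup.
GOAL_INTERVAL = (0.75, 1.0)

for action in ("safe", "risky"):
    p = reach_probability(TRANSITIONS, {"s0": action}, {"goal"}, "s0")
    print(action, p, within(p, *GOAL_INTERVAL))
```

Under these assumptions, only the policy choosing `safe` meets the goal interval. A preference interval such as [1, 1] would be checked the same way with `within(p, 1.0, 1.0)`.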