Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Solving QNP and FOND+ with Generating, Testing and Forbidding
Authors: Zheyuan Shi, Hao Dong, Yongmei Liu
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implemented three solvers in C++ using the above three rearrangement strategies: GTF-BFF (FOND+ solver), GTF-3FF (FOND+ solver) and GTF-3FN (QNP solver). Note that in this section, when we mention FOND+ domains or problems, we specifically refer to those that cannot be represented as QNPs. We evaluate the performance of our three solvers on QNPs in comparison with DSET [Zeng et al., 2022], FOND-ASP [Rodriguez et al., 2022], and three solvers using qnp2fond translator [Bonet and Geffner, 2020], each paired with different underlying FOND solvers for SC planning: PRP [Muise et al., 2012], (FOND-)SAT [Geffner and Geffner, 2018], and PR2 [Muise et al., 2024]. For FOND+ planning, we compare GTF-BFF and GTF-3FF against FOND-ASP. All experiments were run on an Ubuntu 20.04 Linux machine with an Intel Core i9-10980XE CPU (3.00 GHz). Each instance was allocated a maximum of 8 GB of memory and a runtime limit of 30 minutes. ... The information about the size of these domains and overall solving results are shown in Tables 1 (QNP) and 2 (FOND+). ... Figure 3 illustrates the overall coverage performance (the ratio of all instances solved) of various solvers over time for QNP/FOND+. |
| Researcher Affiliation | Academia | Zheyuan Shi, Hao Dong, Yongmei Liu Dept. of Computer Science, Sun Yat-sen University, Guangzhou 510006, China EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: SIEVE* ... Algorithm 2: FOND+ Solver. Entry Program ... Algorithm 3: FOND+ Solver. GTF |
| Open Source Code | Yes | Codes and data: https://github.com/sysulic/GTF4FONDX. |
| Open Datasets | Yes | The QNP instances with small numbers of actions, features, and reachable states are categorized as Tiny-domains, including: blocks clear, blocks on, gripper, delivery, delivery2, q1, q2 (unsolvable), q3, gripper2 and rewards from Bonet and Geffner [2020]; Gripper1u (unsolvable) and Nest3u (unsolvable) from Zeng et al. [2022]; and 9 instances in qnp1 from Rodriguez et al. [2022]. Other existing QNP domains (each including one or more instances) include: Nests and Nests u (all instances are unsolvable) from Zeng et al. [2022]; qnp2 from Rodriguez et al. [2022]; Gripper Abs, Ferry Abs, Logistics Abs, Zenotravel Abs, Nomystery Abs, and Floortile Abs from Dong et al. [2025]. Existing FOND+ domains are all from Rodriguez et al. [2022]: qnp2-f11, qnp2-f01 (unsolvable), football and football u (unsolvable). |
| Dataset Splits | No | The paper refers to |
| Hardware Specification | Yes | All experiments were run on an Ubuntu 20.04 Linux machine with an Intel Core i9-10980XE CPU (3.00 GHz). Each instance was allocated a maximum of 8 GB of memory and a runtime limit of 30 minutes. |
| Software Dependencies | No | The paper states, "We implemented three solvers in C++" and mentions the operating system "Ubuntu 20.04 Linux." However, it does not provide specific version numbers for the C++ compiler, associated libraries, or any other critical software dependencies. |
| Experiment Setup | Yes | Each instance was allocated a maximum of 8 GB of memory and a runtime limit of 30 minutes. |
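
The experiment setup quoted above (8 GB of memory and 30 minutes per instance) and the coverage metric (the ratio of all instances solved over time) can be illustrated with a small harness. This is a minimal sketch, not the authors' tooling: the function names `run_instance` and `coverage` are hypothetical, the 8 GB limit is assumed to mean 8 × 1024³ bytes of address space, the 30 minutes are assumed to be wall-clock time, and an exit code of 0 is assumed to mean "solved". It relies on the POSIX-only `resource` module.

```python
import resource
import subprocess

# Limits taken from the quoted setup; the byte interpretation is an assumption.
MEMORY_LIMIT_BYTES = 8 * 1024**3   # 8 GB read as 8 GiB of address space
TIME_LIMIT_SECONDS = 30 * 60       # 30 minutes, assumed to be wall-clock time


def _limit_memory():
    # Runs in the child process just before the solver starts (POSIX only).
    resource.setrlimit(resource.RLIMIT_AS,
                       (MEMORY_LIMIT_BYTES, MEMORY_LIMIT_BYTES))


def run_instance(command, time_limit=TIME_LIMIT_SECONDS):
    """Run one solver command (hypothetical) under the memory and time limits.

    Returns True if the process exits with code 0 within the time limit.
    """
    try:
        result = subprocess.run(
            command,
            preexec_fn=_limit_memory,
            timeout=time_limit,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def coverage(solve_times, total_instances, t):
    """Fraction of all instances solved within t seconds.

    solve_times holds one runtime (in seconds) per solved instance;
    unsolved instances are simply absent from the list.
    """
    solved = sum(1 for s in solve_times if s <= t)
    return solved / total_instances
```

Evaluating `coverage` at increasing values of `t` gives the points of a coverage-over-time curve for one solver; the actual solver command lines are not given in the text above and would need to come from the linked repository.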