Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Progression Heuristics for Planning with Probabilistic LTL Constraints

Authors: Ian Mallett, Sylvie Thiebaux, Felipe Trevizan | Pages: 11870-11879

AAAI 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments show that they further widen the scalability gap between heuristic search and verification approaches to these planning problems. [...] Section 6 gives experimental results and Section 7 concludes with related and future work.
Researcher Affiliation	Academia	Ian Mallett, Sylvie Thiébaux, Felipe Trevizan Research School of Computer Science, The Australian National University EMAIL, EMAIL, EMAIL
Pseudocode	No	The paper describes mathematical formulations and processes using equations and text, but it does not provide any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	Both PPDDL and PRISM versions of the problems are available at https://gitlab.com/fwt/mo-pltl-ssps-benchmarks. This link provides access to the problem definitions/benchmarks, not the source code for the authors' implemented methodology.
Open Datasets	Yes	For evaluation, we use the Factory and Wall-e domains from (Baumgartner, Thiébaux, and Trevizan 2018), and a new domain called Priority Search. [...] Both PPDDL and PRISM versions of the problems are available at https://gitlab.com/fwt/mo-pltl-ssps-benchmarks.
Dataset Splits	No	The paper does not describe dataset splits for training, validation, or testing in the typical machine learning sense. It evaluates planning problem solvers on defined problem instances.
Hardware Specification	Yes	The experiments were ran on an Intel i7-7700@3.6GHz using Gurobi 8.1.1 on a single thread and a 20mins and 4Gb cutoff.
Software Dependencies	Yes	The experiments were ran on an Intel i7-7700@3.6GHz using Gurobi 8.1.1 on a single thread and a 20mins and 4Gb cutoff.
Experiment Setup	Yes	The experiments were ran on an Intel i7-7700@3.6GHz using Gurobi 8.1.1 on a single thread and a 20mins and 4Gb cutoff. We used the default options of PRISM and the -lp flag to use their LP approach to MO-PLTL SSPs because, without this flag, PRISM was unable to solve any of our benchmarks.
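
For readers who want to approximate the time and memory cutoffs quoted in the Experiment Setup row, the following is a minimal sketch and not the authors' harness. It assumes a POSIX system, reads "4Gb" as 4 GiB of address space, and uses a hypothetical helper name (`run_with_cutoffs`); the solver command to run is left to the caller.

```python
import resource
import subprocess
import sys

TIME_CUTOFF_S = 20 * 60          # 20-minute cutoff, as quoted in the table
MEM_CUTOFF_BYTES = 4 * 1024**3   # assumption: "4Gb" is read as 4 GiB


def _limit_memory():
    # Runs in the child process just before exec (POSIX only).
    # Only the soft limit is lowered, and never above the existing hard limit.
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard == resource.RLIM_INFINITY:
        cap = MEM_CUTOFF_BYTES
    else:
        cap = min(MEM_CUTOFF_BYTES, hard)
    resource.setrlimit(resource.RLIMIT_AS, (cap, hard))


def run_with_cutoffs(cmd, time_s=TIME_CUTOFF_S):
    """Run cmd under the time and memory cutoffs; hypothetical helper."""
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=time_s,            # child is killed when this expires
            preexec_fn=_limit_memory,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "stdout": ""}
    status = "ok" if proc.returncode == 0 else "failed"
    return {"status": status, "stdout": proc.stdout}


if __name__ == "__main__":
    # Stand-in command; a real run would pass the planner or PRISM invocation.
    print(run_with_cutoffs([sys.executable, "-c", "print('solved')"]))
```

A run that exceeds the memory cap typically surfaces as a non-zero exit code ("failed" here), so a fuller harness would likely want to distinguish out-of-memory failures from other errors.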