Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Hierarchical Decompositions and Termination Analysis for Generalized Planning

Authors: Siddharth Srivastava

JAIR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present theoretical as well as empirical results illustrating the scope of this new approach. Our analysis shows that this approach significantly extends the class of generalized plans that can be assessed automatically, thereby reducing barriers in the synthesis and learning of reliable generalized plans.
Researcher Affiliation	Academia	Siddharth Srivastava EMAIL School of Computing and Augmented Intelligence Arizona State University Tempe, AZ 85281 USA
Pseudocode	Yes	Algorithm 1: (Progress-Sieve) abstract policy termination test (Srivastava et al., 2015) ... Algorithm 2: Gen Sieve
Open Source Code	No	The paper states: "We developed a preliminary implementation of Gen Sieve in Python." However, it does not provide a specific link, repository, or explicit statement of code release for public access.
Open Datasets	No	The paper states: "We tested the implementation using custom generated FMPs as well as randomized, autogenerated FMPs." It refers to these as randomly generated policies for testing, but does not provide any specific access information (link, citation, repository) for a publicly available dataset.
Dataset Splits	No	The paper uses "custom generated FMPs as well as randomized, autogenerated FMPs" for its empirical evaluation. While it mentions properties of these generated FMPs (e.g., increasing numbers of nodes), it does not provide specific details on how they were split into training, validation, or test sets for reproducibility.
Hardware Specification	Yes	All experiments were carried out on a laptop with a 3.1 GHz Quad-Core Intel Core i7 processor and 16 GB of RAM.
Software Dependencies	No	The paper mentions: "We developed a preliminary implementation of Gen Sieve in Python." While it specifies the programming language Python, it does not provide any version number for Python or specific versions of any libraries, frameworks, or solvers used in the implementation.
Experiment Setup	Yes	We limited the randomly generated policies to use a small number of variables to make the analysis of termination harder as with larger numbers of variables there tend to be fewer instances of the same variable being increased and decreased in the same strongly connected component. Further analysis of the ratio of variables to control states and its relationship with difficulty of asserting termination is a promising direction in the study of termination assessment of FMPs. The runtime for this Python implementation was less than 2-3s in all of our experiments. However it can be difficult to randomly generate interesting policies that can be manually verified as terminating policies, especially with more than 5-6 control states.
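
The Experiment Setup entry above notes that termination is harder to establish when the same variable is both increased and decreased within one strongly connected component. The following is a minimal sketch of a sieve-style test that illustrates why. It is not the paper's Gen Sieve implementation, which is not publicly released. The function names, the edge encoding (`(source, target, effects)` with `"+"`/`"-"` per variable), and the assumption that every variable is non-negative and each decrease makes progress toward zero are all illustrative assumptions.

```python
from collections import defaultdict


def scc_map(nodes, arcs):
    """Map each node to a representative of its strongly connected component (Kosaraju)."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        fwd[u].append(v)
        rev[v].append(u)

    order, seen = [], set()

    def visit(u):
        seen.add(u)
        for v in fwd[u]:
            if v not in seen:
                visit(v)
        order.append(u)  # post-order finish time

    for n in nodes:
        if n not in seen:
            visit(n)

    comp = {}

    def assign(u, root):
        comp[u] = root
        for v in rev[u]:
            if v not in comp:
                assign(v, root)

    for u in reversed(order):
        if u not in comp:
            assign(u, u)
    return comp


def sieve_terminates(nodes, edges):
    """Return True if the sieve proves termination, False if it cannot decide.

    edges: list of (src, dst, effects), effects maps variable -> "+" or "-".
    Assumption (hypothetical encoding): variables are non-negative, so an edge
    that decreases a variable can be taken only finitely often unless some
    edge in the same component increases that variable.
    """
    alive = list(edges)
    while True:
        comp = scc_map(nodes, [(u, v) for u, v, _ in alive])
        # Group the edges that stay inside a single component.
        internal = defaultdict(list)
        for e in alive:
            if comp[e[0]] == comp[e[1]]:
                internal[comp[e[0]]].append(e)
        if not internal:
            return True  # no cycles remain, so every run is finite

        removable = []
        for es in internal.values():
            increased = {v for _, _, eff in es for v, s in eff.items() if s == "+"}
            for e in es:
                # A progress edge decreases a variable nothing in this component increases.
                if any(s == "-" and v not in increased for v, s in e[2].items()):
                    removable.append(e)
        if not removable:
            return False  # some cycle has no certifying variable
        alive = [e for e in alive if not any(e is r for r in removable)]


# Nested loops: x certifies the outer loop, then y certifies the inner self-loop.
nested = [("a", "b", {"x": "-"}), ("b", "a", {"y": "+"}), ("b", "b", {"y": "-"})]
print(sieve_terminates(["a", "b"], nested))
```

In the nested example the outer loop is removed first because `x` is only ever decreased; once that edge is gone, the increase of `y` no longer shares a component with the self-loop, so `y` certifies the inner loop. With few variables, a variable is more likely to be both increased and decreased inside one component, in which case this simple test returns `False` without deciding termination.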