Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Hierarchical Decompositions and Termination Analysis for Generalized Planning

Authors: Siddharth Srivastava

JAIR 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We present theoretical as well as empirical results illustrating the scope of this new approach. Our analysis shows that this approach significantly extends the class of generalized plans that can be assessed automatically, thereby reducing barriers in the synthesis and learning of reliable generalized plans. |
| Researcher Affiliation | Academia | Siddharth Srivastava EMAIL School of Computing and Augmented Intelligence Arizona State University Tempe, AZ 85281 USA |
| Pseudocode | Yes | Algorithm 1: (Progress-Sieve) abstract policy termination test (Srivastava et al., 2015) ... Algorithm 2: Gen Sieve |
| Open Source Code | No | The paper states: "We developed a preliminary implementation of Gen Sieve in Python." However, it does not provide a specific link, repository, or explicit statement of code release for public access. |
| Open Datasets | No | The paper states: "We tested the implementation using custom generated FMPs as well as randomized, autogenerated FMPs." It refers to these as randomly generated policies for testing, but does not provide any specific access information (link, citation, repository) for a publicly available dataset. |
| Dataset Splits | No | The paper uses "custom generated FMPs as well as randomized, autogenerated FMPs" for its empirical evaluation. While it mentions properties of these generated FMPs (e.g., increasing numbers of nodes), it does not provide specific details on how they were split into training, validation, or test sets for reproducibility. |
| Hardware Specification | Yes | All experiments were carried out on a laptop with a 3.1GHz Quad-Core Intel Core i7 processor and 16GB of RAM. |
| Software Dependencies | No | The paper mentions: "We developed a preliminary implementation of Gen Sieve in Python." While it specifies the programming language Python, it does not provide any version number for Python or specific versions of any libraries, frameworks, or solvers used in the implementation. |
| Experiment Setup | Yes | We limited the randomly generated policies to use a small number of variables to make the analysis of termination harder as with larger numbers of variables there tend to be fewer instances of the same variable being increased and decreased in the same strongly connected component. Further analysis of the ratio of variables to control states and its relationship with difficulty of asserting termination is a promising direction in the study of termination assessment of FMPs. The runtime for this Python implementation was less than 2-3s in all of our experiments. However it can be difficult to randomly generate interesting policies that can be manually verified as terminating policies, especially with more than 5-6 control states. |
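
The Experiment Setup entry reasons that fewer variables make it more likely for the same variable to be both increased and decreased within one strongly connected component. The sketch below illustrates how that condition could be checked on a toy policy graph. It is not the paper's Gen Sieve implementation: the edge encoding `(src, dst, var, delta)` and the function names `random_policy`, `strongly_connected_components`, and `conflicted_variables` are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def random_policy(num_states, num_vars, num_edges, rng):
    """Hypothetical stand-in for a randomly generated policy graph.

    Each edge (src, dst, var, delta) moves between control states and
    increments (+1) or decrements (-1) a single variable.
    """
    edges = []
    for _ in range(num_edges):
        src = rng.randrange(num_states)
        dst = rng.randrange(num_states)
        var = rng.randrange(num_vars)
        delta = rng.choice((+1, -1))
        edges.append((src, dst, var, delta))
    return edges


def strongly_connected_components(num_states, edges):
    """Kosaraju's algorithm; returns a list mapping state -> component id."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for src, dst, _, _ in edges:
        fwd[src].append(dst)
        rev[dst].append(src)

    order, seen = [], set()

    def visit(u):
        # First pass: record states in order of DFS completion.
        seen.add(u)
        for v in fwd[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in range(num_states):
        if u not in seen:
            visit(u)

    comp = [-1] * num_states

    def assign(u, c):
        # Second pass: label everything reachable in the reversed graph.
        comp[u] = c
        for v in rev[u]:
            if comp[v] == -1:
                assign(v, c)

    next_id = 0
    for u in reversed(order):
        if comp[u] == -1:
            assign(u, next_id)
            next_id += 1
    return comp


def conflicted_variables(num_states, edges):
    """Return (component, var) pairs where var is both increased and
    decreased on edges that lie inside the same component."""
    comp = strongly_connected_components(num_states, edges)
    signs = defaultdict(set)
    for src, dst, var, delta in edges:
        if comp[src] == comp[dst]:
            signs[(comp[src], var)].add(delta)
    return {key for key, seen_signs in signs.items() if seen_signs == {1, -1}}


if __name__ == "__main__":
    rng = random.Random(0)
    for num_vars in (2, 8):
        counts = [
            len(conflicted_variables(6, random_policy(6, num_vars, 12, rng)))
            for _ in range(200)
        ]
        print(num_vars, "variables:", sum(counts) / len(counts), "conflicts on average")
```

The demo at the bottom compares two variable counts on graphs with 6 control states and 12 edges; these sizes are arbitrary choices, and the toy generator makes no attempt to match the distribution of FMPs used in the paper.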