Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
BDD Ordering Heuristics for Classical Planning
Authors: P. Kissmann, J. Hoffmann
JAIR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on a wide range of variable ordering variants corroborate our theoretical findings. Furthermore, we show that dynamic reordering is much more effective at reducing BDD size, but it is not cost-effective due to a prohibitive runtime overhead. We exhibit the potential of middle-ground techniques, running dynamic reordering until simple stopping criteria hold. |
| Researcher Affiliation | Academia | Peter Kissmann EMAIL Jörg Hoffmann EMAIL Saarland University, Saarbrücken, Germany |
| Pseudocode | No | The paper describes algorithms and methods verbally, and uses figures to illustrate concepts (e.g., BDD examples, DTGs, Causal Graphs), but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured, code-like formatting. |
| Open Source Code | No | The paper mentions using "Gamer as the base implementation for all planners" and CUDD, which are third-party tools. It does not provide any specific links or explicit statements about releasing source code for the methodology or specific implementations described in this paper. |
| Open Datasets | Yes | We ran the benchmarks of the 2011 International Planning Competition (IPC 11) |
| Dataset Splits | No | The paper mentions using "the benchmarks of the 2011 International Planning Competition (IPC 11)" and refers to tasks within these benchmarks (e.g., "Visit All, task 011", "Peg Sol, task 015"). However, it does not specify any training/test/validation splits for these benchmark tasks, as the IPC benchmarks usually consist of a set of problem instances for evaluation rather than requiring a train/test split for model development. |
| Hardware Specification | Yes | We ran the benchmarks of the 2011 International Planning Competition (IPC 11), and we used Gamer as the base implementation for all planners, running them on one core of an Intel Xeon X5690 CPU with 3.47 GHz. |
| Software Dependencies | No | The paper mentions "Gamer" (a symbolic search planner) and "CUDD" (a BDD package) as key software components. However, it does not specify version numbers for these dependencies, which would be necessary for reproducibility. |
| Experiment Setup | Yes | Unless otherwise stated, we used the IPC 11 settings, namely a timeout of 30 minutes and a memory limit of 6 GB. ... we used a time-out of one minute ... Reordering is automatically started when the number of allocated BDD nodes reaches a certain threshold (by default, the first threshold is 4000 nodes), which is dynamically adapted after each reordering (by default, the next threshold is set to 2 times the number of nodes after reordering). |