Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Merge-and-Shrink Task Reformulation for Classical Planning
Authors: Álvaro Torralba, Silvan Sievers
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 Experiments. We implemented the M&S reformulation framework in Fast Downward (FD) [Helmert, 2006b], using its existing M&S framework [Sievers, 2018] and extending it with weak bisimulation as well as pruning transformations that remove dead labels and irrelevant TSs and labels. [...] 6.1 Search Space Reduction. To assess the impact of our task reformulations on the reachable state space, we run uniform-cost search and evaluate the number of expansions until the last f-layer. Fig. 2 compares the FDR representation against a-ls and d-ls with bisimulation (top) and weak bisimulation (bottom) shrinking. |
| Researcher Affiliation | Academia | 1Saarland University, Saarland Informatics Campus, Germany 2University of Basel, Switzerland |
| Pseudocode | No | The paper describes algorithms but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | Implementation: https://doi.org/10.5281/zenodo.3232878, dataset with benchmarks: https://doi.org/10.5281/zenodo.3232844. |
| Open Datasets | Yes | We use all STRIPS benchmarks from the optimal/satisficing tracks of all IPCs, two sets consisting of 1827/1816 tasks across 48 unique domains. Implementation: https://doi.org/10.5281/zenodo.3232878, dataset with benchmarks: https://doi.org/10.5281/zenodo.3232844. |
| Dataset Splits | No | The paper evaluates on 'STRIPS benchmarks from the optimal/satisficing tracks of all IPCs' which are pre-defined planning tasks, not a single dataset split into train/validation/test sets for model training. Therefore, explicit dataset splits are not provided. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions 'Fast Downward (FD) [Helmert, 2006b]' and 'its existing M&S framework [Sievers, 2018]', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We impose a time limit of 900s on the reformulation process. For the overall planning, we use a limit of 3.5 GiB and 1800s. We consider DFP (d-ls) and sbMIASM (m-ls, called dyn-MIASM originally) [Sievers et al., 2014], with a size limit of 1000 on the resulting product. |
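
The Research Type entry quotes the paper's search-space measure: uniform-cost search, counting expansions until the last f-layer. The following is a minimal Python sketch of that measure, not the authors' Fast Downward implementation. It assumes that "until the last f-layer" means states expanded with a cost strictly below the optimal plan cost, and that action costs are non-negative. The function name `expansions_until_last_layer`, its parameters, and the graph `TOY_GRAPH` are hypothetical and exist only for illustration.

```python
import heapq
from itertools import count
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

State = Hashable
Successors = Callable[[State], Iterable[Tuple[State, int]]]


def expansions_until_last_layer(
    initial: State,
    is_goal: Callable[[State], bool],
    successors: Successors,
) -> Tuple[Optional[int], int]:
    """Run uniform-cost search with non-negative action costs.

    Returns (optimal_cost, expansions with g < optimal_cost).
    If no goal is reachable, returns (None, total expansions).
    """
    best_g: Dict[State, int] = {initial: 0}
    tie = count()  # tie-breaker so states themselves are never compared
    queue = [(0, next(tie), initial)]
    expanded_per_layer: Dict[int, int] = {}  # g-value -> number of expansions
    closed = set()

    while queue:
        g, _, state = heapq.heappop(queue)
        if state in closed:
            continue  # stale queue entry for an already expanded state
        if is_goal(state):
            # Assumption: the last layer is the one holding the optimal cost,
            # so only strictly cheaper layers are counted.
            before_last = sum(
                n for layer, n in expanded_per_layer.items() if layer < g
            )
            return g, before_last
        closed.add(state)
        expanded_per_layer[g] = expanded_per_layer.get(g, 0) + 1
        for succ, cost in successors(state):
            new_g = g + cost
            if succ not in closed and new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                heapq.heappush(queue, (new_g, next(tie), succ))

    return None, sum(expanded_per_layer.values())


# Hypothetical toy state space: state -> [(successor, action cost), ...]
TOY_GRAPH = {
    "A": [("B", 1), ("C", 2)],
    "B": [("D", 2), ("E", 2)],
    "C": [("D", 1)],
    "D": [],
    "E": [],
}
```

Calling `expansions_until_last_layer("A", lambda s: s == "D", lambda s: TOY_GRAPH[s])` returns `(3, 3)`: the optimal cost is 3, and the states A, B, and C are expanded before the final layer. Excluding the last layer makes the count independent of tie-breaking among states whose cost equals the optimal plan cost, which is presumably why the measure is defined that way.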