reproducibilityindex.ai

On Weak Stubborn Sets in Classical Planning

Authors: Silvan Sievers, Martin Wehrle

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To complete the picture, we investigate the pruning power of GWSS, CSS, and GSSS within an A search also in practice. We use the state-of-the-art saturated cost partitioning (SCP) heuristic [Seipp et al., 2020] computed over pattern databases [Edelkamp, 2001], generated systematically up to pattern size 2 and via hill climbing [Haslum et al., 2007], and over Cartesian abstractions [Seipp and Helmert, 2018]. We added implementations of CSS and GWSS to the existing implementation of GSSS in the planner Fast Downward 20.06 [Helmert, 2006]. Our evaluation uses the STRIPS planning benchmarks of all optimal tracks of all International Planning Competitions from 1998 to 2018 which yields a set consisting of 1827 tasks from 65 domains. We conduct experiments on Intel Xeon Silver 4114 CPUs using Downward Lab [Seipp et al., 2017] and impose a memory limit of 3.5 Gi B and 30 minutes on every planner run. The code, benchmarks and experimental data are published online [Sievers and Wehrle, 2021a]. To assess the pruning power, we deﬁne the pruning ratio of a (completed) search using a pruning function as follows. We sum up, over all expanded states s, the number of all successors of s as nall and the number of generated successors of s (i.e., successors not pruned by the pruning function) as ngen. We deﬁne the pruning ratio as 1 ngen nall , yielding values from 0 to 1, where 0 means that no states are pruned and 1 means that all states are pruned. Figure 2 shows scatter plots comparing the pruning ratio of all three methods, highlighting domains where the difference of the pruning ratio is above 0.1.
Researcher Affiliation	Academia	Silvan Sievers1 and Martin Wehrle 1University of Basel, Switzerland silvan.sievers@unibas.ch
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	The code, benchmarks and experimental data are published online [Sievers and Wehrle, 2021a].
Open Datasets	Yes	Our evaluation uses the STRIPS planning benchmarks of all optimal tracks of all International Planning Competitions from 1998 to 2018 which yields a set consisting of 1827 tasks from 65 domains. The code, benchmarks and experimental data are published online [Sievers and Wehrle, 2021a].
Dataset Splits	No	The paper uses planning benchmarks and tasks, but does not specify explicit training, validation, and test dataset splits in terms of percentages or counts, or refer to standard splits within the benchmarks.
Hardware Specification	Yes	We conduct experiments on Intel Xeon Silver 4114 CPUs using Downward Lab [Seipp et al., 2017] and impose a memory limit of 3.5 Gi B and 30 minutes on every planner run.
Software Dependencies	Yes	We added implementations of CSS and GWSS to the existing implementation of GSSS in the planner Fast Downward 20.06 [Helmert, 2006].
Experiment Setup	Yes	To complete the picture, we investigate the pruning power of GWSS, CSS, and GSSS within an A search also in practice. We use the state-of-the-art saturated cost partitioning (SCP) heuristic [Seipp et al., 2020] computed over pattern databases [Edelkamp, 2001], generated systematically up to pattern size 2 and via hill climbing [Haslum et al., 2007], and over Cartesian abstractions [Seipp and Helmert, 2018]. We added implementations of CSS and GWSS to the existing implementation of GSSS in the planner Fast Downward 20.06 [Helmert, 2006]. As the comparison needs a common implementation basis, we use the operator-centric algorithm [Alkhazraji et al., 2012; Wehrle and Helmert, 2014] rather than the more efﬁcient, state-ofthe-art atom-centric algorithm as the latter is not compatible with mutex-based interference [R oger et al., 2020]. To maximally distinguish GWSS from GSSS, we always choose to include all enablers rather than all disablers to satisfy condition C4 of Def. 9.