Fat- and Heavy-Tailed Behavior in Satisficing Planning

Authors: Eldan Cohen, J. Christopher Beck

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we study the runtime distribution of satisficing planning in ensembles of random planning problems and in multiple runs of a randomized heuristic search on a single planning instance. Using common heuristic functions (such as FF) and six benchmark problem domains from the IPC, we find a heavy-tailed behavior, similar to that found in CSP and SAT. We investigate two notions of constrainedness, often used in the modeling of planning problems, and show that the heavy-tailed behavior tends to appear in relatively relaxed problems, where the required effort is, on average, low. Finally, we show that as with randomized restarts in CSP and SAT solving, recent search enhancements that incorporate randomness in the search process can help mitigate the effect of the heavy tail.
Researcher Affiliation | Academia | Eldan Cohen, J. Christopher Beck; Department of Mechanical & Industrial Engineering, University of Toronto, Toronto, Canada; {ecohen, jcb}@mie.utoronto.ca
Pseudocode | No | The paper describes methods and algorithms textually but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any specific links or statements indicating that the source code for its methodology is publicly available.
Open Datasets | Yes | To study the effect of resource constrainedness, we use the benchmark domains used in Nakhost et al. (2012): No Mystery, Rovers, and TPP. To study the effect of goal constrainedness, we use the domains Maintenance, Parking, and Freecell. No Mystery is based on the domain used in IPC 11. Rovers is based on the domain from IPC 02. TPP was used in IPC 06. Maintenance was used in IPC 14. Parking was used in IPC 11 and IPC 14. Freecell was used in IPC 00 and IPC 02.
Dataset Splits | No | The paper analyzes runtime distributions on ensembles of random problems and multiple runs of a randomized search; this is not a typical machine learning setup that explicitly defines train/validation/test splits with percentages or sample counts.
Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or specific computing environments used for the experiments.
Software Dependencies | No | The paper mentions using the FF heuristic and fitting a GPD model in R (via a citation to Carmona 2014), but it does not provide specific version numbers for these or any other software components.
Experiment Setup | Yes | We use GBFS and configure the planner not to re-open nodes. The heuristic function being used is FF with deferred heuristic evaluation. Given a heuristic function h(x) and a parameter p ≥ 0 that represents the extent of randomization, we consider h_Δp(x) to be a p-randomized version of h if for all states s: h_Δp(s) = h(s) + Δh_p, where Δh_p is a random number in the range [−p·h(s), p·h(s)]. Figure 2b shows the search effort distribution of 1000 runs of a randomized search with h^FF_Δ0.05 on the same instance for different values of C. We present the result for landmark count (Figure 2c), and provide the τ4 values for each C in Table 1. For all heuristics, we found a transition to a heavy-tailed regime as we relax the problem; however, the heavy-tailed regime occurred at a higher C value compared to h^FF (for landmark count at C=5.0, compared to C=4.0 for FF). This result is also consistent with CSPs, where the effect of using a less informative heuristic is that the heavy-tailed behavior shifts to a less constrained region (Gomes et al. 2005). ϵ-GBFS expands a node selected uniformly at random from the open list with probability ϵ, and the minimal h-value node with probability 1 − ϵ. Figure 5 compares the effort distribution on a single instance for h^FF_Δ0.05 in a standard GBFS to an ϵ-GBFS with ϵ=0.2 and ϵ=0.5 (dotted and dashed lines, respectively).
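The p-randomized heuristic and the ϵ-GBFS node-selection rule quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the planner's actual API: the function names are invented, the open list is represented as a plain Python list, and ties/bookkeeping of a real GBFS are omitted.

```python
import random

def randomized_heuristic(h, p):
    """Return a p-randomized version h_dp of heuristic h:
    h_dp(s) = h(s) + dh_p, with dh_p drawn uniformly
    from [-p * h(s), p * h(s)]."""
    def h_dp(s):
        base = h(s)
        return base + random.uniform(-p * base, p * base)
    return h_dp

def epsilon_gbfs_select(open_list, epsilon, h):
    """epsilon-GBFS selection rule: with probability epsilon, pick a
    node uniformly at random from the open list; otherwise pick the
    node with minimal h-value."""
    if random.random() < epsilon:
        return random.choice(open_list)
    return min(open_list, key=h)
```

With p=0 the randomized heuristic reduces to h itself, and with ϵ=0 the selection rule reduces to standard greedy best-first selection, matching the baselines compared in the paper's figures.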