Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Unified Framework for Planning in Adversarial and Cooperative Environments
Authors: Anagha Kulkarni, Siddharth Srivastava, Subbarao Kambhampati
Pages: 2479-2487
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also present an empirical evaluation to show the feasibility and usefulness of our approaches using IPC domains. (Section 5, Empirical Evaluation) We now present an empirical analysis of all four approaches. |
| Researcher Affiliation | Academia | Anagha Kulkarni, Siddharth Srivastava, Subbarao Kambhampati; School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA; {anaghak, siddharths, rao}@asu.edu |
| Pseudocode | Yes | Algorithm 1: Plan Computation |
| Open Source Code | No | We modified the STRIPS planner Pyperplan (Alkhazraji et al. 2016) to implement our algorithms. We used the hsa (Keyder and Geffner 2008) heuristic of Pyperplan because it gave the best results in terms of computation time. The paper does not state that the authors' own code is publicly available. |
| Open Datasets | Yes | We use three IPC domains, namely Blocksworld, Logistics and Driverlog to evaluate our approach. |
| Dataset Splits | No | No specific dataset split information (percentages, counts, or explicit standard splits for training/validation/testing) is provided. |
| Hardware Specification | Yes | We ran our experiments on 12 core Intel Xeon CPU with an E5-2643 v3@3.40GHz processor with a 64G RAM with 20 minutes time-out. |
| Software Dependencies | No | The paper mentions 'Pyperplan (Alkhazraji et al. 2016)' and 'hsa (Keyder and Geffner 2008) heuristic' but does not specify their version numbers or any other software dependencies with versions. |
| Experiment Setup | Yes | We ran the experiments with k = 3, j = 2, ℓ = 3, m = 3, d_min = 0.25 and d_max = 0.50 for all the domains. |