Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Goal Recognition Design in Deterministic Environments

Authors: Sarah Keren, Avigdor Gal, Erez Karpas

JAIR 2019

Reproducibility Variable Result LLM Response
Research Type Experimental In our empirical evaluation we instantiate a variety of independent, persistent and monotonic-nd GRD models that comply with the requirements specified above and show the effect of design on WCD. GRD analysis consists of two core tasks, namely calculating WCD and minimizing it. Accordingly, we divide our evaluation into two main parts. In the first, we measure WCD in different goal recognition settings, and evaluate the methods we have suggested to calculate it. In particular, we compare our compilation-based approaches, and examine their efficiency for the different GRD settings. The second part of our analysis focuses on the design task, and evaluates WCD reduction achieved through redesign in various settings, using the different modification methods suggested in Section 5.3. We also examine the benefits of pruning using the pruned-reduce algorithm, and compare it to exhaustive-reduce.
Researcher Affiliation Academia Sarah Keren EMAIL Harvard University School of Engineering and Applied Sciences Cambridge, Massachusetts 02138, USA Avigdor Gal EMAIL Erez Karpas EMAIL Technion Israel Institute of Technology Haifa 3200003, Israel
Pseudocode Yes Algorithm 1 wcd-bfs
Open Source Code Yes A full code base and dataset together with a GRD task generator can be found at https://github.com/sarah-keren/goal-recognition-design
Open Datasets Yes Our dataset consists of four uniform cost goal recognition domains adapted from Ramirez and Geffner (2009), namely Grid-Navigation (GRID), IPC-Grid+ (GRID+), Blockwords (BLOCK), and Logistics (LOG). We also examined three uniform cost domains adapted from Pereira et al. (2017), namely Intrusion Detection (I-DET), Depots (DEP), and Campus (CAM). All benchmarks are based on PDDL domains from the deterministic track of the International Planning Competitions (IPC). ... A full code base and dataset together with a GRD task generator can be found at https://github.com/sarah-keren/goal-recognition-design
Dataset Splits No The paper describes the generation of problem instances for different observability settings (FO, NO, POD, POND) and the application of various modification methods, but it does not specify explicit training/test/validation dataset splits in the conventional sense used for machine learning models. The experiments focus on calculating and reducing WCD for these generated instances rather than on model training and evaluation using data splits.
Hardware Specification Yes Experiments were run on Intel(R) Xeon(R) CPU X5690 machines, with a time limit of 30 minutes and memory limit of 2 GB.
Software Dependencies No For the solution of the compiled planning problems, we used the Fast Downward planning system (Helmert, 2006), running A* with the LM-CUT heuristic (Helmert & Domshlak, 2009) for all but the ISS domain, for which the IPDB heuristic (Haslum, Botea, Helmert, Bonet, Koenig, et al., 2007) was used. The paper mentions the Fast Downward planning system and specific heuristics, but does not provide version numbers for these software components or any other ancillary software used in the experiments.
Experiment Setup Yes We implemented the four modification methods described in Section 5.3, namely action removal (AR), action conditioning (AC), sensor placement (SP), and single-action sensor refinement (SR). ... To evaluate the effect of design on WCD, and particularly the effect of specific modification types, we examined all instances with a design budget of 4 assigned once for each modification type and once as an overall budget. The constraint function required the optimal cost to any of the goals to remain unchanged.
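
The planner configuration quoted in the Software Dependencies and Hardware Specification rows (Fast Downward running A* with LM-CUT or IPDB, a 30-minute time limit, and a 2 GB memory limit) might be approximated with the sketch below. It is not taken from the authors' code base. The helper name `fast_downward_command`, the `./fast-downward.py` path, and the PDDL file names are hypothetical. The `--overall-time-limit` and `--overall-memory-limit` options belong to the current Fast Downward driver script and are assumed to match the unspecified version used in the paper.

```python
from typing import List


def fast_downward_command(
    domain: str,
    problem: str,
    heuristic: str = "lmcut",
    time_limit: str = "30m",
    memory_limit: str = "2G",
) -> List[str]:
    """Build an argument list for a Fast Downward A* run (hypothetical helper).

    heuristic: "lmcut" (LM-CUT) or "ipdb" (IPDB), the two heuristics
    named in the Software Dependencies row.
    """
    if heuristic not in {"lmcut", "ipdb"}:
        raise ValueError(f"unsupported heuristic: {heuristic}")
    return [
        "./fast-downward.py",  # assumed location of the driver script
        "--overall-time-limit", time_limit,      # 30 minutes, as reported
        "--overall-memory-limit", memory_limit,  # 2 GB, as reported
        domain,
        problem,
        "--search", f"astar({heuristic}())",
    ]
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting problems with the parentheses in the search string. For example, `fast_downward_command("domain.pddl", "problem.pddl", heuristic="ipdb")` would select the IPDB heuristic instead of LM-CUT.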