Sequential Plan Recognition
Authors: Reuth Mirsky, Roni Stern, Ya’akov (Kobi) Gal, Meir Kalech
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6, Empirical Evaluation: We evaluated the probing approaches described in the previous sections on two separate domains from the plan recognition literature. The first is the simulated domain used by Kabanza et al. [2013]. We used their same configuration, which includes 100 instances with a fixed number of actions, five identified goals, and a branching factor of 3 for rules in the grammar. The second domain involves students' interactions with the Virtual Labs system when solving two different types of problems: the problem described in Section 2, and a problem which required students to determine the concentration level of an unknown acid solution by performing a chemical titration process. Figure 3 shows the average percentage of hypotheses remaining from the initial hypothesis set (H0) as a function of the number of queries performed. Table 2 shows the average number of queries needed until reaching the minimal set of hypotheses, for each probing strategy. |
| Researcher Affiliation | Academia | Reuth Mirsky, Roni Stern, Ya'akov (Kobi) Gal, and Meir Kalech, Department of Information Systems Engineering, Ben-Gurion University of the Negev, Israel {dekelr,sternron,kobig,kalech}@bgu.ac.il |
| Pseudocode | Yes | Algorithm 1: Sequential Plan Recognition Process. Input: H0 is the initial set of hypotheses. Input: QA is a query function. Input: π is a query policy. 1: i ← 0; CLOSED ← ∅; 2: while ⋃_{h∈Hi} h \ CLOSED ≠ ∅ or \|Hi\| = 1 do … 4: Hi+1 ← Update(QA(p), Hi, p) … 6: Add p to CLOSED |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of its own methodology's code. |
| Open Datasets | No | The paper references a 'simulated domain used by Kabanza et al. [2013]' and describes using '35 logs of students' interactions in Virtual Labs', but it does not provide concrete access information (e.g., specific URLs, repositories, or explicit statements about public availability) for these datasets. |
| Dataset Splits | No | The paper does not specify any training, validation, or test dataset splits (e.g., percentages or sample counts) needed to reproduce the experiment. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the 'PHATT algorithm [Geib and Goldman, 2009]' but does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the implementation. |
| Experiment Setup | Yes | We used their same configuration which includes 100 instances with a fixed number of actions, five identified goals, and a branching factor of 3 for rules in the grammar. For both domains, we kept the PR algorithm constant as the PHATT algorithm [Geib and Goldman, 2009] and only varied the type of query mechanism used for the SPR. For both domains we used the plan recognition output after 7 observations. |
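The pseudocode excerpt above (Algorithm 1) describes a loop that repeatedly picks an unqueried plan node, asks the query function whether it belongs to the agent's plan, and prunes inconsistent hypotheses. A minimal sketch of that loop in Python follows; the hypothesis representation (sets of plan nodes), the `policy` and `query_answer` callables, and the toy data are illustrative assumptions, not the paper's PHATT-based implementation.

```python
def spr(hypotheses, query_answer, policy):
    """Sequential plan recognition loop (sketch of Algorithm 1).

    hypotheses   -- list of hypotheses, each a frozenset of plan nodes (H0)
    query_answer -- QA: returns True iff the queried node is in the agent's plan
    policy       -- picks the next node to query from the unqueried candidates
    """
    closed = set()  # CLOSED: nodes already queried
    # Continue while more than one hypothesis remains and some
    # hypothesis still contains an unqueried node.
    while len(hypotheses) > 1:
        candidates = set().union(*hypotheses) - closed
        if not candidates:
            break
        node = policy(candidates)          # p <- pi(Hi)
        in_plan = query_answer(node)       # QA(p)
        # Update: keep only hypotheses consistent with the answer.
        hypotheses = [h for h in hypotheses if (node in h) == in_plan]
        closed.add(node)                   # add p to CLOSED
    return hypotheses

# Toy usage: three candidate hypotheses, the agent's true plan is {"a", "b"},
# and the policy simply queries the lexicographically smallest candidate.
true_plan = {"a", "b"}
H0 = [frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"b", "d"})]
result = spr(H0, lambda n: n in true_plan, lambda cands: min(cands))
```

The lexicographic policy here is a stand-in; the paper's contribution lies in smarter query policies that minimize the number of queries needed to reach the minimal hypothesis set.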