Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Teaching People by Justifying Tree Search Decisions: An Empirical Study in Curling

Authors: Cleyton R. Silva, Michael Bowling, Levi H.S. Lelis

JAIR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental A user study with 122 people shows that the participants who had access to the justifications produced by our system achieved much higher scores in a curling test than those who only observed the decision made by KR-UCT and those with access to the justifications of a baseline system.
Researcher Affiliation Academia Cleyton R. Silva EMAIL Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil; Michael Bowling EMAIL, Levi H. S. Lelis EMAIL Department of Computing Science, University of Alberta, Alberta Machine Intelligence Institute, Edmonton, Canada
Pseudocode Yes Algorithm 1 CFJ. Require: Game G, state s, algorithm A, number of states n. Ensure: Set of counterfactual states C of s. Algorithm 2 Select-Counterfactuals. Require: States S, intended state s, a set of features F, set of counterfactual states C of s.
Open Source Code No The paper does not provide an explicit statement or link to the source code for the methodology described.
Open Datasets No In our user study we use CFJ to teach strategies to curling rookies. We use a set of binary features that we manually extracted from curling manuals. The dataset for the user study was collected by the authors, and the paper does not state that it is publicly available or give concrete access information.
Dataset Splits Yes In total we had 122 participants, 41 for treatments Plain and CFJ, and 40 for Max. We had 77 male and 43 female participants, and 2 participants identified themselves as other.
Hardware Specification No The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments.
Software Dependencies No The paper mentions using the curling simulator of Yee et al. (2016) and Kernel Regression UCT (KR-UCT) but does not provide version numbers for any software dependencies or name the programming languages used for the authors' own implementation.
Experiment Setup Yes We ran KR-UCT with 1,000 samples and it returned, for the red player, the action a depicted by the blue trajectory in the figure, where the rock should be placed at the center of house. We then estimated the probability of the player winning the match by simulating the game forward from the state-action pair (s, a) to the end of the game with a domain-specific policy for both players. We use the play-out policy from the original KR-UCT implementation (Yee et al., 2016). We performed 1,000 simulations and discovered that the red player has a 63.56% chance of winning the match.
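
The win-probability estimate described in the Experiment Setup row is a Monte Carlo estimate: run many simulated play-outs from a state-action pair and report the fraction that end in a win. The sketch below illustrates that idea only. It is not the authors' code, and it does not use the curling simulator or the KR-UCT play-out policy; `estimate_win_probability` and `toy_playout` are hypothetical names, and the toy play-out simply wins with a fixed, assumed probability.

```python
import math
import random


def estimate_win_probability(playout, n_simulations=1000, seed=0):
    """Estimate P(win) as the fraction of simulated play-outs that end in a win.

    `playout` is a hypothetical callable that takes a random.Random instance
    and returns True if the player wins that simulated game.
    """
    rng = random.Random(seed)
    wins = sum(1 for _ in range(n_simulations) if playout(rng))
    p_hat = wins / n_simulations
    # Standard error of a binomial proportion, to show how noisy the estimate is.
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_simulations)
    return p_hat, std_err


def toy_playout(rng, true_p=0.6):
    # Stand-in for a real simulator: wins with an assumed fixed probability.
    return rng.random() < true_p


if __name__ == "__main__":
    p_hat, std_err = estimate_win_probability(toy_playout, n_simulations=1000)
    print(f"estimated win probability: {p_hat:.3f} +/- {std_err:.3f}")
```

With 1,000 play-outs the standard error of such an estimate is at most about 0.016, which gives a rough sense of the precision that a sample of this size can support.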