Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Teaching People by Justifying Tree Search Decisions: An Empirical Study in Curling

Authors: Cleyton R. Silva, Michael Bowling, Levi H.S. Lelis

JAIR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	A user study with 122 people shows that the participants who had access to the justifications produced by our system achieved much higher scores in a curling test than those who only observed the decision made by KR-UCT and those with access to the justifications of a baseline system.
Researcher Affiliation	Academia	Cleyton R. Silva EMAIL Departamento de Informática, Universidade Federal de Viçosa, Viçosa, Brazil. Michael Bowling EMAIL, Levi H. S. Lelis EMAIL Department of Computing Science, University of Alberta, Alberta Machine Intelligence Institute, Edmonton, Canada.
Pseudocode	Yes	Algorithm 1 CFJ. Require: Game G, state s, algorithm A, number of states n. Ensure: Set of counterfactual states C of s. Algorithm 2 Select-Counterfactuals. Require: States S, intended state s, a set of features F, set of counterfactual states C of s.
Open Source Code	No	The paper does not provide an explicit statement or link to the source code for the methodology described.
Open Datasets	No	In our user study we use CFJ to teach strategies to curling rookies. We use a set of binary features that we manually extracted from curling manuals. The dataset for the user study is collected by the authors and not explicitly stated to be publicly available with concrete access information.
Dataset Splits	Yes	In total we had 122 participants, 41 for treatments Plain and CFJ, and 40 for Max. We had 77 male and 43 female participants, and 2 participants identified themselves as other.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments.
Software Dependencies	No	The paper mentions using the curling simulator of Yee et al. (2016) and Kernel Regression UCT (KR-UCT) but does not provide specific version numbers for any software dependencies or programming languages used for their own implementation.
Experiment Setup	Yes	We ran KR-UCT with 1,000 samples and it returned, for the red player, the action a depicted by the blue trajectory in the figure, where the rock should be placed at the center of house. We then estimated the probability of the player winning the match by simulating the game forward from the state-action pair (s, a) to the end of the game with a domain-specific policy for both players. We use the play-out policy from the original KR-UCT implementation (Yee et al., 2016). We performed 1,000 simulations and discovered that the red player has a 63.56% chance of winning the match.
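
The Experiment Setup entry above describes estimating a win probability by simulating the game forward many times from a state-action pair. The sketch below illustrates that general Monte Carlo idea only; it is not the authors' code. The function names `estimate_win_probability` and `toy_playout` are hypothetical, the playout is a random stand-in rather than the curling simulator of Yee et al. (2016), and it assumes each playout yields a binary win/loss outcome, which the quoted text does not specify.

```python
import math
import random


def estimate_win_probability(playout, n_simulations=1000, seed=0):
    """Estimate a win probability by averaging the outcomes of repeated playouts.

    `playout` is any callable that takes a random.Random instance and
    returns True for a win and False otherwise (an assumption of this sketch).
    Returns the estimate and its binomial standard error.
    """
    rng = random.Random(seed)
    wins = sum(1 for _ in range(n_simulations) if playout(rng))
    p_hat = wins / n_simulations
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_simulations)
    return p_hat, std_err


def toy_playout(rng, true_win_rate=0.6):
    """Hypothetical stand-in for simulating one game to the end.

    A real playout would apply a domain-specific policy for both players
    from the state-action pair (s, a); here a biased coin is flipped instead.
    """
    return rng.random() < true_win_rate


p_hat, std_err = estimate_win_probability(toy_playout, n_simulations=1000)
print(f"estimated win probability: {p_hat:.3f} (standard error {std_err:.3f})")
```

With 1,000 binary playouts and a win rate near 0.6, the standard error is roughly 0.015, which gives a sense of how much such an estimate can vary between runs.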