Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Decision-Theoretic Model of Assistance

Authors: A. Fern, S. Natarajan, K. Judah, P. Tadepalli

JAIR 2014

Reproducibility Variables

Research Type: Experimental
LLM Response: "We evaluate our approach in two game-like computer environments where human subjects perform tasks, and in a real-world domain of providing assistance during folder navigation in a computer desktop environment. The results show that in all three domains the framework results in an assistant that substantially reduces user effort with only modest computation. [...] In this section, we present the results of conducting user studies and simulations in three domains: two game-like environments and a folder predictor domain for an intelligent desktop assistant."

Researcher Affiliation: Academia
LLM Response: "Alan Fern EMAIL School of EECS, Oregon State University, Corvallis, OR USA; Sriraam Natarajan EMAIL SoIC, Indiana University, Bloomington, IN USA; Kshitij Judah EMAIL School of EECS, Oregon State University, Corvallis, OR USA; Prasad Tadepalli EMAIL School of EECS, Oregon State University, Corvallis, OR USA"

Pseudocode: Yes
LLM Response: "Table 1: Pseudo-code for Sparse Sampling in the HGMDP"

Open Source Code: No
LLM Response: The paper contains no explicit statement or link indicating that the source code for the described methodology is open source or publicly available.

Open Datasets: No
LLM Response: "As a first step, we used the data collected from their user interface and used our model to make predictions. We use the user's response to our predictions to make further predictions. Also, to handle the possibility of a new folder, we consider all the folders in the folder hierarchies for each prediction. We used a mixture density to obtain the probability distribution over the folders. [...] The data set consists of a collection of requests to open a file (Open) and save a file (save As), ordered by time. Each request contains information such as the type of request (open or save As), the current task, the destination folder, etc. The data set consists of a total of 810 open/save As requests." The paper describes the data collected from user interfaces and a "data set," but provides no concrete access information (link, DOI, repository, or formal citation with authors and year) for its public availability.

Dataset Splits: No
LLM Response: The paper describes performing user studies and simulations on a collected dataset, but specifies no train/test/validation splits or other splitting methodology needed for reproducibility. It mentions that it "simulated the human users by choosing actions according to policies learned from their observed actions from the previous user study," but gives no split ratios or counts.

Hardware Specification: No
LLM Response: The paper discusses the computational cost and run time of various heuristics and sparse sampling (e.g., "increased the average run time (last column) by an order of magnitude"), but does not specify concrete hardware details such as GPU models, CPU types, or memory amounts used for the experiments.

Software Dependencies: No
LLM Response: The paper does not mention any specific software libraries, frameworks, or programming languages, let alone version numbers, that would be needed to replicate the experiments.

Experiment Setup: Yes
LLM Response: "We also conducted experiments using sparse sampling with non-zero depths. We considered depths of d = 1 and d = 2 while using sampling widths of b = 1 or b = 2. The leaves of the sparse sampling tree are evaluated using Hr which simply applies rollout to the user policy."
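The quoted setup (depths d in {1, 2}, sampling widths b in {1, 2}, and a rollout-based leaf heuristic Hr) matches the standard sparse-sampling pattern. As a rough illustration only, here is a minimal generic sketch of that pattern; the `mdp` interface (`actions`, `sample`, `is_terminal`, `user_policy`, `gamma`) is hypothetical and is not the paper's HGMDP implementation (the paper's Table 1 gives the actual pseudo-code):

```python
def sparse_sample_value(mdp, state, depth, width):
    """Depth-limited sparse sampling: for each action, draw `width`
    sampled successors, back up their estimated values, and return the
    best action value; leaves (depth 0) are scored by a rollout heuristic."""
    if mdp.is_terminal(state):
        return 0.0
    if depth == 0:
        return rollout_value(mdp, state)
    best = float("-inf")
    for action in mdp.actions(state):
        total = 0.0
        for _ in range(width):
            next_state, reward = mdp.sample(state, action)
            total += reward + mdp.gamma * sparse_sample_value(
                mdp, next_state, depth - 1, width)
        best = max(best, total / width)
    return best


def rollout_value(mdp, state, horizon=20):
    """Leaf heuristic in the spirit of the paper's Hr: follow a fixed
    user policy for a bounded horizon, accumulating discounted reward."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        if mdp.is_terminal(state):
            break
        state, reward = mdp.sample(state, mdp.user_policy(state))
        total += discount * reward
        discount *= mdp.gamma
    return total
```

Here `depth` and `width` play the roles of d and b in the quoted excerpt; the per-action sample average keeps the tree size at roughly (|A| * b)^d regardless of the state space.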