Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Decision-Theoretic Model of Assistance

Authors: A. Fern, S. Natarajan, K. Judah, P. Tadepalli

JAIR 2014

Reproducibility Variables

Research Type: Experimental
LLM Response: "We evaluate our approach in two game-like computer environments where human subjects perform tasks, and in a real-world domain of providing assistance during folder navigation in a computer desktop environment. The results show that in all three domains the framework results in an assistant that substantially reduces user effort with only modest computation. [...] In this section, we present the results of conducting user studies and simulations in three domains: two game-like environments and a folder predictor domain for an intelligent desktop assistant."

Researcher Affiliation: Academia
LLM Response: "Alan Fern EMAIL School of EECS, Oregon State University, Corvallis, OR USA; Sriraam Natarajan EMAIL SoIC, Indiana University, Bloomington, IN USA; Kshitij Judah EMAIL School of EECS, Oregon State University, Corvallis, OR USA; Prasad Tadepalli EMAIL School of EECS, Oregon State University, Corvallis, OR USA"

Pseudocode: Yes
LLM Response: "Table 1: Pseudo-code for Sparse Sampling in the HGMDP"

Open Source Code: No
LLM Response: The paper contains no explicit statement or link indicating that the source code for the described methodology is open source or publicly available.

Open Datasets: No
LLM Response: "As a first step, we used the data collected from their user interface and used our model to make predictions. We use the user's response to our predictions to make further predictions. Also, to handle the possibility of a new folder, we consider all the folders in the folder hierarchies for each prediction. We used a mixture density to obtain the probability distribution over the folders. [...] The data set consists of a collection of requests to open a file (Open) and save a file (save As), ordered by time. Each request contains information such as the type of request (open or save As), the current task, the destination folder, etc. The data set consists of a total of 810 open/save As requests." The paper describes the data collected from user interfaces and a "data set," but provides no concrete access information (link, DOI, repository, or formal citation with authors and year) for its public availability.

Dataset Splits: No
LLM Response: The paper describes performing user studies and simulations on a collected dataset, but specifies no train/test/validation splits or other splitting methodology needed for reproducibility. It mentions that it "simulated the human users by choosing actions according to policies learned from their observed actions from the previous user study," but gives no split ratios or counts.

Hardware Specification: No
LLM Response: The paper discusses the computational cost and run time of various heuristics and sparse sampling (e.g., "increased the average run time (last column) by an order of magnitude"), but does not specify concrete hardware details such as GPU models, CPU types, or memory amounts used for the experiments.

Software Dependencies: No
LLM Response: The paper does not mention any specific software libraries, frameworks, or programming languages, let alone version numbers, that would be needed to replicate the experiments.

Experiment Setup: Yes
LLM Response: "We also conducted experiments using sparse sampling with non-zero depths. We considered depths of d = 1 and d = 2 while using sampling widths of b = 1 or b = 2. The leaves of the sparse sampling tree are evaluated using Hr which simply applies rollout to the user policy."
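The quoted setup (depths d in {1, 2}, sampling widths b in {1, 2}, and a rollout-based leaf heuristic Hr) matches the standard sparse-sampling pattern. As a rough illustration only, here is a minimal generic sketch of that pattern; the `mdp` interface (`actions`, `sample`, `is_terminal`, `user_policy`, `gamma`) is hypothetical and is not the paper's HGMDP implementation (the paper's Table 1 gives the actual pseudo-code):

```python
def sparse_sample_value(mdp, state, depth, width):
    """Depth-limited sparse sampling: for each action, draw `width`
    sampled successors, back up their estimated values, and return the
    best action value; leaves (depth 0) are scored by a rollout heuristic."""
    if mdp.is_terminal(state):
        return 0.0
    if depth == 0:
        return rollout_value(mdp, state)
    best = float("-inf")
    for action in mdp.actions(state):
        total = 0.0
        for _ in range(width):
            next_state, reward = mdp.sample(state, action)
            total += reward + mdp.gamma * sparse_sample_value(
                mdp, next_state, depth - 1, width)
        best = max(best, total / width)
    return best


def rollout_value(mdp, state, horizon=20):
    """Leaf heuristic in the spirit of the paper's Hr: follow a fixed
    user policy for a bounded horizon, accumulating discounted reward."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        if mdp.is_terminal(state):
            break
        state, reward = mdp.sample(state, mdp.user_policy(state))
        total += discount * reward
        discount *= mdp.gamma
    return total
```

Here `depth` and `width` play the roles of d and b in the quoted excerpt; the per-action sample average keeps the tree size at roughly (|A| * b)^d regardless of the state space.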