Bayesian Execution Skill Estimation

Authors: Christopher Archibald, Delma Nieves-Rivera (pp. 6014-6021)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Sections 5, 6, and 7 present and discuss experimental work in two domains, and section 8 concludes with discussion and future work. The success of the proposed method is demonstrated experimentally in a toy domain as well as the domain of computational billiards.
Researcher Affiliation | Academia | Christopher Archibald, Delma Nieves-Rivera; archibald@cse.msstate.edu, din7@msstate.edu; Computer Science and Engineering, Mississippi State University
Pseudocode | No | The paper describes the Bayesian approach textually and through a network diagram (Figure 2), but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit statement of code release) for the source code of the methodology described.
Open Datasets | No | The paper uses data generated from simulations within a 'toy domain' and 'computational billiards' using a simulator and agents, but does not provide concrete access information (link, DOI, repository, formal citation) for any publicly available or open dataset used or generated.
Dataset Splits | No | While the paper mentions 'validation on a subset of the data' for parameter determination, it does not provide specific details on train/validation/test splits (e.g., exact percentages, sample counts, or explicit splitting methodology) for its experiments, which seem to use an online evaluation approach.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the 'FastFiz physics simulator' but does not provide specific version numbers for this or any other software dependencies, which are required for full reproducibility.
Experiment Setup | Yes | The OR and TBA methods both used the same 17 hypothesis execution skill levels. In each state and for each execution skill level, the optimal action was computed by convolution of the reward function with the execution noise distribution using a resolution of 0.01. The execution skill levels used were 0.025, 0.125, 0.25, 0.375, 0.5 and 0.75. β was set for the experiments with CueCard to a value of 0.55. TBA was run with 100 hypothesis execution skill levels evenly spaced in the interval [0.01, 0.9]. A total of around 1,900 experiments were performed for each of the different agents.
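
The convolution step quoted in the experiment setup can be illustrated with a minimal sketch. This is not the authors' code: the one-dimensional reward function is hypothetical, the execution noise is assumed to be zero-mean Gaussian with the skill level as its standard deviation, and the kernel is truncated at four standard deviations. Only the resolution of 0.01 and the six skill levels come from the text above.

```python
import numpy as np

RESOLUTION = 0.01  # grid spacing quoted in the experiment setup
SKILL_LEVELS = [0.025, 0.125, 0.25, 0.375, 0.5, 0.75]  # values quoted above

# Hypothetical 1-D reward (an assumption for illustration):
# +1 on [0, 0.5], -1 on (0.5, 1], and 0 elsewhere on the grid [-2, 2].
steps = np.arange(-200, 201)
actions = steps * RESOLUTION
reward = np.where((steps >= 0) & (steps <= 50), 1.0, 0.0)
reward = reward - np.where((steps > 50) & (steps <= 100), 1.0, 0.0)


def expected_reward(reward, sigma, resolution=RESOLUTION):
    """Expected reward of aiming at each grid point under Gaussian noise.

    Assumes zero-mean Gaussian execution noise with standard deviation
    `sigma`, truncated at 4 * sigma, and zero reward outside the grid.
    """
    half_width = int(np.ceil(4 * sigma / resolution))
    offsets = np.arange(-half_width, half_width + 1) * resolution
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()  # discrete noise distribution sums to 1
    padded = np.pad(reward, half_width)  # zero reward beyond the grid edges
    # "valid" mode on the padded array returns one value per original action.
    return np.convolve(padded, kernel, mode="valid")


# For each skill level, pick the aim point with the highest expected reward.
optimal_action = {}
for sigma in SKILL_LEVELS:
    values = expected_reward(reward, sigma)
    optimal_action[sigma] = actions[int(np.argmax(values))]

for sigma, action in optimal_action.items():
    print(f"skill level {sigma}: optimal aim {action:.2f}")
```

In this hypothetical reward, a precise agent can aim anywhere well inside the rewarded interval, while a noisy agent shifts its aim away from the penalty region, so the optimal action depends on the execution skill level.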