An Experimental Study of Advice in Sequential Decision-Making Under Uncertainty

Authors: Florian Benavent, Bruno Zanuttini

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We then report on an experimental study of the amount of advice needed for the agent to compute a good policy. Our study shows in particular that continual interaction between the user and the agent is worthwhile, and sheds light on the pros and cons of each type of advice." and, from the Experimental Results section, "We now report on experiments on synthetic MDPs, aimed at evaluating advice along the following dimensions:"
Researcher Affiliation | Academia | "Florian Benavent, Bruno Zanuttini, Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France"
Pseudocode | Yes | "Figure 1: Master problem (top) and subproblem (bottom) for M = ⟨S, A, T, R, γ⟩ with R given by Cr ≤ d." (see the regret sketch below the table)
Open Source Code | No | The paper provides no concrete access information (link or explicit statement) for the source code of the described methodology.
Open Datasets | No | "We first ran experiments on generic MDPs, randomly generated using the same procedure as Regan and Boutilier (2009). Precisely, we generated random MDPs with 20 to 50 states and with 2 to 4 different actions available at each state."
Dataset Splits | No | The paper describes how the MDPs were randomly generated but does not specify train, validation, or test splits (percentages or counts), nor does it reference predefined splits for reproducibility.
Hardware Specification | No | The paper gives no details about the hardware (e.g., CPU or GPU models, cloud resources) used to run the experiments.
Software Dependencies | No | The paper lists no ancillary software, such as library names with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | "Precisely, we generated random MDPs with 20 to 50 states and with 2 to 4 different actions available at each state. The transition function was generated by drawing, for each pair (s, a), log(|S| |A|) reachable states, and the probability of each was generated from a Gaussian. ... We ran 50 simulations with different settings, each one for 10 iterations in iterative scenarios." (see the generation sketch below the table)
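
The Figure 1 caption quoted in the Pseudocode row refers to the constraint-generation scheme of Regan and Boutilier (2009) for computing minimax regret when the reward is only known to satisfy Cr ≤ d. The following is a hedged sketch of that decomposition in the standard visitation-frequency formulation; the notation (F, GEN, δ) is assumed here, and the exact constraints in the paper's Figure 1 may differ.

\[
\mathrm{MMR}(\mathcal{R}) \;=\; \min_{f \in \mathcal{F}} \; \max_{r \in \mathcal{R}} \; \max_{g \in \mathcal{F}} \big( r^\top g - r^\top f \big),
\qquad
\mathcal{R} = \{\, r : C r \le d \,\},
\]

where \(\mathcal{F}\) is the set of valid discounted state-action visitation frequencies of \(M = \langle S, A, T, R, \gamma \rangle\). The master problem minimizes an upper bound \(\delta\) on the regret of \(f\) against a finite set \(\mathrm{GEN}\) of adversarial pairs,

\[
\min_{f \in \mathcal{F},\, \delta} \; \delta
\quad \text{s.t.} \quad r^\top g - r^\top f \le \delta \;\; \text{for all } (g, r) \in \mathrm{GEN},
\]

while the subproblem, given the current \(f\), searches for a maximally violated pair \(\arg\max_{r \in \mathcal{R},\, g \in \mathcal{F}} r^\top (g - f)\) to add to \(\mathrm{GEN}\); the loop stops when the subproblem value no longer exceeds \(\delta\).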
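
The generation procedure quoted in the Open Datasets and Experiment Setup rows can be turned into a short sketch. The snippet below is a minimal Python/NumPy interpretation, not the authors' code: the Gaussian parameters, the reward range, and the use of one action count per MDP (rather than per state) are assumptions the paper does not pin down.

import math
import numpy as np

def random_mdp(rng, gamma=0.95):
    """Generate one random MDP following the procedure quoted above
    (after Regan and Boutilier 2009): 20 to 50 states, 2 to 4 actions,
    and log(|S|*|A|) reachable successors per (s, a) pair, with
    probabilities derived from Gaussian draws."""
    S = int(rng.integers(20, 51))             # 20 to 50 states
    A = int(rng.integers(2, 5))               # 2 to 4 actions (fixed per MDP here; assumption)
    n_succ = max(1, round(math.log(S * A)))   # reachable successors per (s, a)

    T = np.zeros((S, A, S))                   # transition function T[s, a, s']
    for s in range(S):
        for a in range(A):
            succ = rng.choice(S, size=n_succ, replace=False)
            w = np.abs(rng.normal(1.0, 0.5, size=n_succ))  # Gaussian weights (assumed parameters)
            T[s, a, succ] = w / w.sum()       # normalize into a probability distribution

    R = rng.uniform(0.0, 1.0, size=(S, A))    # nominal reward; the paper studies imprecise rewards
    return T, R, gamma

# 50 simulations with different settings, each run for 10 iterations
# in the iterative scenarios, as quoted in the Experiment Setup row.
rng = np.random.default_rng(0)
mdps = [random_mdp(rng) for _ in range(50)]

Drawing positive weights from a Gaussian and normalizing them is one plausible reading of "the probability of each was generated from a Gaussian"; other normalization choices would match the quoted description equally well.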