Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Near-optimal Reinforcement Learning in Factored MDPs
Authors: Ian Osband, Benjamin Van Roy
NeurIPS 2014 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our focus in this paper is upon the statistical aspect of the learning problem and like earlier discussions we do not specify which computational methods are used. Our results serve as a reduction of the reinforcement learning problem to finding an approximate solution for a given FMDP. |
| Researcher Affiliation | Academia | Ian Osband, Stanford University, EMAIL; Benjamin Van Roy, Stanford University, EMAIL |
| Pseudocode | Yes | Algorithm 1 PSRL (Posterior Sampling) and Algorithm 2 UCRL-Factored (Optimism) |
| Open Source Code | No | Our focus in this paper is upon the statistical aspect of the learning problem and like earlier discussions we do not specify which computational methods are used. |
| Open Datasets | No | This paper is theoretical and does not describe experiments involving specific datasets for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware specifications used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details or hyperparameters. |
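The paper's Algorithm 1, PSRL (Posterior Sampling), repeatedly samples an MDP from the agent's posterior, solves it, and acts on the resulting policy while updating the posterior. The following is a minimal tabular sketch under simplifying assumptions of ours, not the paper's: rewards are known, the transition posterior is an independent Dirichlet per state-action pair, and the sampled MDP is solved by finite-horizon value iteration. The function name and signature are illustrative.

```python
import numpy as np

def psrl_average_reward(true_P, true_R, horizon, n_episodes, seed=0):
    """Sketch of PSRL on a tabular MDP (illustrative, not the paper's code).

    true_P : (S, A, S) environment transition probabilities (unknown to agent)
    true_R : (S, A) mean rewards, assumed known here for simplicity
    Returns the average per-episode reward collected while learning.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions, _ = true_P.shape
    # Dirichlet(1, ..., 1) prior over each transition distribution
    counts = np.ones((n_states, n_actions, n_states))
    total = 0.0
    for _ in range(n_episodes):
        # 1. Sample one plausible MDP from the current posterior
        P = np.empty_like(counts)
        for s in range(n_states):
            for a in range(n_actions):
                P[s, a] = rng.dirichlet(counts[s, a])
        # 2. Solve the sampled MDP by finite-horizon value iteration
        V = np.zeros(n_states)
        policy = np.zeros((horizon, n_states), dtype=int)
        for h in reversed(range(horizon)):
            Q = true_R + P @ V          # (S, A): reward plus expected value
            policy[h] = Q.argmax(axis=1)
            V = Q.max(axis=1)
        # 3. Act in the true MDP and update the posterior counts
        s = 0
        for h in range(horizon):
            a = policy[h, s]
            total += true_R[s, a]
            s_next = rng.choice(n_states, p=true_P[s, a])
            counts[s, a, s_next] += 1
            s = s_next
    return total / n_episodes
```

Algorithm 2, UCRL-Factored, replaces step 1 with an optimistic choice over a confidence set of MDPs; the per-episode solve-then-act structure is the same.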