Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems

Authors: Young Hun Jung, Ambuj Tewari

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also present empirical results that support our theoretical findings."
Researcher Affiliation | Academia | Young Hun Jung, Department of Statistics, University of Michigan, yhjung@umich.edu; Ambuj Tewari, Department of Statistics, University of Michigan, tewaria@umich.edu
Pseudocode | Yes | "Algorithm 1 Thompson sampling in restless bandits"
Open Source Code | Yes | "Our code is available at https://github.com/yhjung88/ThompsonSamplinginRestlessBandits"
Open Datasets | No | The paper uses Monte Carlo simulation with a uniform prior over a finite parameter support and does not refer to a specific, named public dataset with access information. It discusses the Gilbert-Elliott channel model studied by Liu and Zhao [2010], but this is a model for simulation rather than a dataset one could access.
Dataset Splits | No | The paper does not mention training/validation/test splits for any dataset; it relies on simulations rather than pre-existing datasets with defined splits.
Hardware Specification | No | The paper does not specify any hardware used for running experiments.
Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers.
Experiment Setup | Yes | "We fix L = 50 and m = 30. We use Monte Carlo simulation with size 100 or greater to approximate expectations. As each arm has two parameters, there are 2K parameters. For these, we set the prior distribution to be uniform over a finite support {0.1, 0.2, ..., 0.9}."
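
The Pseudocode row refers to Algorithm 1, Thompson sampling in restless bandits, whose episodic structure is: draw a parameter vector from the current posterior, compute a policy for the drawn model, follow that policy for the L steps of the episode, then update the posterior with the observed trajectory. The sketch below is only a schematic reading of that loop, not the authors' implementation; `env`, `posterior`, and `planner` are hypothetical interfaces, and the planner that maps a sampled model to a (near-)optimal policy is deliberately left abstract, since that is the problem-specific component.

```python
def episodic_thompson_sampling(env, posterior, planner, num_episodes, episode_len, rng):
    """Schematic episodic Thompson sampling loop (Algorithm 1-style sketch).

    env       -- restless-bandit simulator (assumed interface): reset() -> obs,
                 step(action) -> (obs, reward)
    posterior -- belief over model parameters (assumed interface): sample(rng),
                 update(trajectory)
    planner   -- maps a sampled parameter vector to a policy: policy(obs) -> action
    """
    rewards = []
    for _ in range(num_episodes):
        params = posterior.sample(rng)      # 1. draw one model from the posterior
        policy = planner(params)            # 2. plan against the sampled model
        obs = env.reset()
        trajectory = []
        for _ in range(episode_len):        # 3. act for one episode of length L
            action = policy(obs)
            obs, reward = env.step(action)
            trajectory.append((action, obs, reward))
            rewards.append(reward)
        posterior.update(trajectory)        # 4. fold the episode back into the posterior
    return rewards
```

Keeping the planner as a separate subroutine matches the usual Thompson-sampling pattern of treating planning for a known model as its own step, distinct from posterior sampling and updating.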
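
The Experiment Setup and Open Datasets rows describe simulated two-state Gilbert-Elliott arms with a uniform prior over the finite support {0.1, 0.2, ..., 0.9} for each arm's two parameters. A minimal sketch of the posterior component for one such arm is given below, assuming the two parameters are p01 (bad-to-good) and p11 (good-to-good) transition probabilities and that an arm's state is observed only when it is pulled; the class and variable names are illustrative and not taken from the authors' repository.

```python
import itertools
import numpy as np

GRID = np.linspace(0.1, 0.9, 9)  # finite support {0.1, 0.2, ..., 0.9}

def transition_matrix(p01, p11):
    """Two-state Gilbert-Elliott chain; state 1 = good, state 0 = bad."""
    return np.array([[1.0 - p01, p01],
                     [1.0 - p11, p11]])

class GridPosterior:
    """Exact posterior over the (p01, p11) grid for a single restless arm.

    Because the arm keeps evolving while unobserved, two consecutive
    observations may be separated by `gap` hidden transitions, so the
    likelihood uses the gap-step transition probability of each candidate chain.
    """
    def __init__(self):
        self.support = list(itertools.product(GRID, GRID))  # 81 hypotheses
        self.log_post = np.zeros(len(self.support))         # uniform prior

    def update(self, prev_state, new_state, gap):
        for i, (p01, p11) in enumerate(self.support):
            P = np.linalg.matrix_power(transition_matrix(p01, p11), gap)
            self.log_post[i] += np.log(P[prev_state, new_state])

    def sample(self, rng):
        """Draw one (p01, p11) pair for a Thompson-sampling episode."""
        w = np.exp(self.log_post - self.log_post.max())
        return self.support[rng.choice(len(self.support), p=w / w.sum())]


# Toy usage: the arm was seen in the good state twice, three steps apart.
rng = np.random.default_rng(0)
post = GridPosterior()
post.update(prev_state=1, new_state=1, gap=3)
print(post.sample(rng))
```

With two parameters per arm there are 2K in total, so one such posterior per arm, sampled independently at the start of each episode, would yield the parameter vector handed to the planner.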