reproducibilityindex.ai

Thompson Sampling for Bandits with Clustered Arms

Authors: Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show, both theoretically and empirically, how exploiting a given cluster structure can signiﬁcantly improve the regret and computational cost compared to using standard Thompson sampling. and Finally, we perform an empirical evaluation showing that our algorithms perform well compared to previously proposed algorithms for bandits with clustered arms.
Researcher Affiliation	Academia	Emil Carlsson , Devdatt Dubhashi , Fredrik D. Johansson Department of Computer Science and Engineering, Chalmers University {caremil, dubhashi, fredrik.johansson}@chalmers.se
Pseudocode	Yes	Algorithm 1 TSC and Algorithm 2 HTS
Open Source Code	No	No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found.
Open Datasets	No	The paper uses synthetic data and data generated 'in the same way as in [Bouneffouf et al., 2019]', but does not provide concrete access information (e.g., specific links, DOIs, or direct citations to a publicly available dataset) for the data used in its experiments.
Dataset Splits	No	The paper describes synthetic data generation and simulation time steps, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) as would be common for static datasets.
Hardware Specification	No	No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running the experiments were provided. The acknowledgements mention 'resources provided by the Swedish National Infrastructure for Computing (SNIC)' but without specifications.
Software Dependencies	No	No specific ancillary software details, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9), were found. The paper mentions various algorithms and methods but not their software implementations with versions.
Experiment Setup	Yes	We run all algorithms with there corresponding standard parameter (v = 1 for Lin TS and Lin TSC, c = 2 for Lin UCB and Lin UCBC). and We set the reward probability of the best arm in the optimal cluster to be 0.6 and for the worst arm in the optimal cluster we set it to be 0.6 w . For the remaining A 2 arms in the optimal cluster we draw the reward probability from U(0.6 w , 0.6) for each arm.