Thompson Sampling for Bandits with Clustered Arms
Authors: Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show, both theoretically and empirically, how exploiting a given cluster structure can significantly improve the regret and computational cost compared to using standard Thompson sampling. and Finally, we perform an empirical evaluation showing that our algorithms perform well compared to previously proposed algorithms for bandits with clustered arms. |
| Researcher Affiliation | Academia | Emil Carlsson , Devdatt Dubhashi , Fredrik D. Johansson Department of Computer Science and Engineering, Chalmers University {caremil, dubhashi, fredrik.johansson}@chalmers.se |
| Pseudocode | Yes | Algorithm 1 TSC and Algorithm 2 HTS |
| Open Source Code | No | No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found. |
| Open Datasets | No | The paper uses synthetic data and data generated 'in the same way as in [Bouneffouf et al., 2019]', but does not provide concrete access information (e.g., specific links, DOIs, or direct citations to a publicly available dataset) for the data used in its experiments. |
| Dataset Splits | No | The paper describes synthetic data generation and simulation time steps, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) as would be common for static datasets. |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running the experiments were provided. The acknowledgements mention 'resources provided by the Swedish National Infrastructure for Computing (SNIC)' but without specifications. |
| Software Dependencies | No | No specific ancillary software details, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9), were found. The paper mentions various algorithms and methods but not their software implementations with versions. |
| Experiment Setup | Yes | We run all algorithms with there corresponding standard parameter (v = 1 for Lin TS and Lin TSC, c = 2 for Lin UCB and Lin UCBC). and We set the reward probability of the best arm in the optimal cluster to be 0.6 and for the worst arm in the optimal cluster we set it to be 0.6 w . For the remaining A 2 arms in the optimal cluster we draw the reward probability from U(0.6 w , 0.6) for each arm. |