Thompson Sampling for Combinatorial Semi-Bandits

Authors: Siwei Wang, Wei Chen

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we use some experiments to show the comparison of regrets of CUCB and CTS algorithms.We further conduct empirical simulations, and show that CTS performs much better than the CUCB algorithm of (Chen et al., 2016b) and C-KL-UCB algorithm based on KL-UCB of (Garivier & Cappé, 2011) on both matroid and non-matroid CMAB problem instances.
Researcher Affiliation Collaboration 1Tsinghua University, Beijing, China 2Microsoft Research, Beijing, China.
Pseudocode Yes Algorithm 1 CTS Algorithm for CMAB
Open Source Code No The paper does not contain any statement about releasing source code, nor does it provide a link to a code repository.
Open Datasets No We first generate a random graph with M nodes, and each pair of nodes has an edge with probability p. If the resulting graph has no spanning tree, we regenerate the graph again. The mean of the distribution is randomly and uniformly chosen from [0, 1]. and We build two graphs for this experiment, the results of them are shown in Figure 2(a) and Figure 2(b). The paper describes generating synthetic data rather than using a publicly available dataset with concrete access information.
Dataset Splits No The paper describes generating synthetic data and running simulations but does not specify any training, validation, or test dataset splits.
Hardware Specification No The paper does not provide any specific hardware details such as GPU/CPU models or processor types used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies or their version numbers that would be needed to replicate the experiments.
Experiment Setup Yes The results are shown in Figure 1 with the probability p = 0.6 and M = 30. and In CUCB, we choose the confidence radius to be radi(t) = q 3 log t 2Ni(t), while in CUCB-m, it is q log t 2Ni(t). In C-KL-UCB we choose f(t) = log t + 2 log log t, while in C-KL-UCB-m it is log t.