Thompson Sampling for Combinatorial Semi-Bandits
Authors: Siwei Wang, Wei Chen
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we use some experiments to show the comparison of regrets of CUCB and CTS algorithms.We further conduct empirical simulations, and show that CTS performs much better than the CUCB algorithm of (Chen et al., 2016b) and C-KL-UCB algorithm based on KL-UCB of (Garivier & Cappé, 2011) on both matroid and non-matroid CMAB problem instances. |
| Researcher Affiliation | Collaboration | 1Tsinghua University, Beijing, China 2Microsoft Research, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 CTS Algorithm for CMAB |
| Open Source Code | No | The paper does not contain any statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | No | We first generate a random graph with M nodes, and each pair of nodes has an edge with probability p. If the resulting graph has no spanning tree, we regenerate the graph again. The mean of the distribution is randomly and uniformly chosen from [0, 1]. and We build two graphs for this experiment, the results of them are shown in Figure 2(a) and Figure 2(b). The paper describes generating synthetic data rather than using a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper describes generating synthetic data and running simulations but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or processor types used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers that would be needed to replicate the experiments. |
| Experiment Setup | Yes | The results are shown in Figure 1 with the probability p = 0.6 and M = 30. and In CUCB, we choose the confidence radius to be radi(t) = q 3 log t 2Ni(t), while in CUCB-m, it is q log t 2Ni(t). In C-KL-UCB we choose f(t) = log t + 2 log log t, while in C-KL-UCB-m it is log t. |