Thompson Sampling for (Combinatorial) Pure Exploration

Authors: Siwei Wang, Jun Zhu

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We also conduct experiments to compare the complexity of TS-Explore with existing baselines.
Researcher Affiliation Academia Siwei Wang 1 Jun Zhu 1 1Dept. of Comp. Sci. & Tech., BNRist Center, Tsinghua-Bosch Joint ML Center, Tsinghua University.
Pseudocode Yes Algorithm 1 TS-Explore
Open Source Code No The paper does not provide any specific statements or links regarding the availability of open-source code for the described methodology.
Open Datasets No The experiments use a custom-defined problem instance (Problem 5.1) rather than a publicly available dataset: 'For fixed value n, there are totally 2n base arms. For the first n base arms, their expected rewards equal 0.1, and for the last n base arms, their expected rewards equal 0.9. There are only two super arms: S1 contains the first n base arms and S2 contains the last n base arms.'
Dataset Splits No The paper describes experiments on a custom-defined synthetic problem, stating 'All the results (average complexities and their standard deviations) take an average of 100 independent runs,' but does not specify any training, validation, or test dataset splits.
Hardware Specification No The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies No The paper does not specify any software dependencies with version numbers used for the experiments.
Experiment Setup Yes The paper states specific parameters used in the experiments: 'always choose q = δ in TS-Explore. All the results (average complexities and their standard deviations) take an average of 100 independent runs. We first fix δ = 10 3, and compare the complexity of the above algorithms under different n s (Fig. 1(a)). Then we fix n = 2, and compare the complexity of the above algorithms under different δ s (Fig. 1(b)).'