Thompson Sampling via Local Uncertainty

Authors: Zhendong Wang, Mingyuan Zhou

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results on eight contextual bandit benchmark datasets show that Thompson sampling guided by local uncertainty achieves state-of-the-art performance while having low computational complexity.
Researcher Affiliation | Academia | McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA. Correspondence to: Mingyuan Zhou <mingyuan.zhou@mccombs.utexas.edu>.
Pseudocode | Yes | We describe two different versions of TS via LU, as will be discussed in detail, in Algorithms 2 and 3 in the Appendix, respectively.
Open Source Code | Yes | Python (TensorFlow 1.14) code for both LU-Gauss and LU-SIVI is available at https://github.com/Zhendong-Wang/Thompson-Sampling-via-Local-Uncertainty
Open Datasets | Yes | We consider eight different datasets from this benchmark, including Mushroom, Financial, Statlog, Jester, Wheel, Covertype, Adult, and Census, which exhibit a wide variety of statistical properties. Details on these datasets are provided in Table 3.
Dataset Splits | No | The paper describes a sequential learning setup for contextual bandits where the agent continuously interacts with the environment and updates its estimates. It does not provide traditional train/validation/test splits with percentages or sample counts common in supervised learning.
Hardware Specification | Yes | We report the time cost based on an Nvidia 1080-TI GPU.
Software Dependencies | Yes | Python (TensorFlow 1.14) code for both LU-Gauss and LU-SIVI is available at https://github.com/Zhendong-Wang/Thompson-Sampling-via-Local-Uncertainty
Experiment Setup | Yes | For both LU-Gauss and LU-SIVI, we choose the Adam optimizer with the learning rate set as 10^-3.
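
To illustrate the setup reported above, the following is a minimal sketch, not the authors' released LU-Gauss/LU-SIVI code, of a sequential contextual-bandit loop whose reward model is trained with Adam at learning rate 10^-3 under TensorFlow 1.x. The toy environment, network architecture, and noise-based exploration rule are placeholder assumptions for illustration only; the paper's method instead samples from a learned local-uncertainty distribution.

# Minimal illustrative sketch, not the authors' implementation.
# Assumptions: a toy linear-reward environment, a small two-layer reward
# network, and Gaussian-noise exploration standing in for the paper's
# local-uncertainty sampling. TensorFlow 1.x (e.g., 1.14) API.
import numpy as np
import tensorflow as tf

CONTEXT_DIM, NUM_ARMS = 8, 4

context_ph = tf.placeholder(tf.float32, [None, CONTEXT_DIM])
reward_ph = tf.placeholder(tf.float32, [None, NUM_ARMS])
mask_ph = tf.placeholder(tf.float32, [None, NUM_ARMS])  # 1 only for the arm played

hidden = tf.layers.dense(context_ph, 50, activation=tf.nn.relu)
pred_reward = tf.layers.dense(hidden, NUM_ARMS)

# Regress only on the arm that was actually pulled.
loss = tf.reduce_sum(mask_ph * tf.square(pred_reward - reward_ph)) \
       / tf.maximum(tf.reduce_sum(mask_ph), 1.0)
# Adam with learning rate 10^-3, matching the quoted experiment setup.
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

def toy_environment(rng):
    # Placeholder environment: rewards are linear in the context plus noise.
    context = rng.normal(size=(1, CONTEXT_DIM)).astype(np.float32)
    true_w = np.linspace(-1.0, 1.0, CONTEXT_DIM * NUM_ARMS).reshape(CONTEXT_DIM, NUM_ARMS)
    rewards = context @ true_w + 0.1 * rng.normal(size=(1, NUM_ARMS))
    return context, rewards.astype(np.float32)

rng = np.random.RandomState(0)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(200):  # sequential interaction: no train/test split
        context, rewards = toy_environment(rng)
        scores = sess.run(pred_reward, {context_ph: context})
        # Noise-perturbed argmax as a crude stand-in for sampling the reward
        # estimate from a learned local-uncertainty distribution.
        arm = int(np.argmax(scores + 0.1 * rng.normal(size=scores.shape)))
        mask = np.zeros((1, NUM_ARMS), dtype=np.float32)
        mask[0, arm] = 1.0
        sess.run(train_op, {context_ph: context, reward_ph: rewards, mask_ph: mask})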