Thompson Sampling via Local Uncertainty
Authors: Zhendong Wang, Mingyuan Zhou
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results on eight contextual bandit benchmark datasets show that Thompson sampling guided by local uncertainty achieves state-of-the-art performance while having low computational complexity. |
| Researcher Affiliation | Academia | McCombs School of Business, The University of Texas at Austin, Austin, TX 78712, USA. Correspondence to: Mingyuan Zhou <mingyuan.zhou@mccombs.utexas.edu>. |
| Pseudocode | Yes | We describe two different versions of TS via LU, discussed in detail in Algorithms 2 and 3 in the Appendix, respectively. (A hedged sketch of the sampling step appears below the table.) |
| Open Source Code | Yes | Python (TensorFlow 1.14) code for both LU-Gauss and LU-SIVI is available at https://github.com/Zhendong-Wang/Thompson-Sampling-via-Local-Uncertainty |
| Open Datasets | Yes | We consider eight different datasets from this benchmark, including Mushroom, Financial, Statlog, Jester, Wheel, Covertype, Adult, and Census, which exhibit a wide variety of statistical properties. Details on these datasets are provided in Table 3. |
| Dataset Splits | No | The paper describes a sequential learning setup for contextual bandits where the agent continuously interacts with the environment and updates its estimates. It does not provide traditional train/validation/test splits with percentages or sample counts common in supervised learning. |
| Hardware Specification | Yes | We report the time cost based on an Nvidia 1080-TI GPU. |
| Software Dependencies | Yes | Python (TensorFlow 1.14) code for both LU-Gauss and LU-SIVI is available at https://github.com/Zhendong-Wang/Thompson-Sampling-via-Local-Uncertainty |
| Experiment Setup | Yes | For both LU-Gauss and LU-SIVI, we choose the Adam optimizer with the learning rate set as 10⁻³. (A minimal configuration sketch follows the table.) |
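To make the pseudocode finding concrete, below is a minimal sketch of the action-selection step in the spirit of the paper's LU-Gauss variant: exploration comes from sampling a context-conditioned Gaussian over a latent code, and the arm with the highest sampled reward estimate is chosen. This is not the authors' implementation; the linear encoder, linear reward head, and all dimensions here are illustrative assumptions, and the training update is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper).
n_arms, d_ctx, d_z = 4, 8, 3
W_mu = rng.normal(size=(d_z, d_ctx)) * 0.1   # toy linear encoder: context -> mean of z
log_sigma = np.full(d_z, -1.0)               # toy log-scales of the local Gaussian
W_r = rng.normal(size=(n_arms, d_z)) * 0.1   # toy linear reward head: z -> per-arm rewards


def select_arm(x):
    """One Thompson draw: sample z from the context-conditioned Gaussian
    (the 'local uncertainty'), then act greedily on the sampled rewards."""
    mu = W_mu @ x
    z = mu + np.exp(log_sigma) * rng.normal(size=d_z)
    return int(np.argmax(W_r @ z))


# Single interaction step against a random toy context.
x = rng.normal(size=d_ctx)
print("chosen arm:", select_arm(x))
```

In the paper, the encoder and reward head are neural networks trained online from the interaction history (with LU-SIVI replacing the Gaussian by a semi-implicit distribution); only the sample-then-argmax structure is shown here.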
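The reported experiment setup (Adam, learning rate 10⁻³, TensorFlow 1.14) can be expressed in the TF 1.x API as below. The loss is a placeholder standing in for the paper's actual objective, which is not reproduced here.

```python
import tensorflow as tf  # TensorFlow 1.14, matching the released code

# Placeholder model and loss (assumptions for illustration only).
x = tf.placeholder(tf.float32, shape=[None, 10])
w = tf.Variable(tf.random_normal([10, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))

# Reported configuration: Adam optimizer with learning rate 10^-3.
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, feed_dict={x: [[0.0] * 10]})
```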