An Information-Theoretic Analysis for Thompson Sampling with Many Actions
Authors: Shi Dong, Benjamin Van Roy
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We establish new bounds that depend instead on a notion of rate-distortion. Among other things, this allows us to recover through information-theoretic arguments a near-optimal bound for the linear bandit. We also offer a bound for the logistic bandit that dramatically improves on the best previously available, though this bound depends on an information-theoretic statistic that we have only been able to quantify via computation. and Theorem 1. Let {Θk}K k=1 be any partition of Θ such that for any k = 1, . . . , K and θ, θ Θk, d(θ, θ ) ϵ. Let ψ be defined as in (4) and let θ t and θt satisfy the conditions in Proposition 2. We have Bayes Regret(T; πTS) q Γ I(θ ; ψ) T + ϵ T, |
| Researcher Affiliation | Academia | Shi Dong Stanford University sdong15@stanford.edu Benjamin Van Roy Stanford University bvr@stanford.edu |
| Pseudocode | No | The paper describes theoretical concepts and proofs without including any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper is theoretical and defines problem settings (linear bandits, logistic bandits) rather than using specific, named datasets. The 'simulated information ratio values' in Figure 1 are results from a computation to support a conjecture, not data from a publicly accessible dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation on datasets with explicit splits. |
| Hardware Specification | No | The paper does not specify any hardware details for its theoretical analysis or the computational results presented in Figure 1. |
| Software Dependencies | No | The paper does not mention specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details, hyperparameters, or training configurations. |