reproducibilityindex.ai

An Information-Theoretic Analysis for Thompson Sampling with Many Actions

Authors: Shi Dong, Benjamin Van Roy

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We establish new bounds that depend instead on a notion of rate-distortion. Among other things, this allows us to recover through information-theoretic arguments a near-optimal bound for the linear bandit. We also offer a bound for the logistic bandit that dramatically improves on the best previously available, though this bound depends on an information-theoretic statistic that we have only been able to quantify via computation. and Theorem 1. Let {Θk}K k=1 be any partition of Θ such that for any k = 1, . . . , K and θ, θ Θk, d(θ, θ ) ϵ. Let ψ be deﬁned as in (4) and let θ t and θt satisfy the conditions in Proposition 2. We have Bayes Regret(T; πTS) q Γ I(θ ; ψ) T + ϵ T,
Researcher Affiliation	Academia	Shi Dong Stanford University sdong15@stanford.edu Benjamin Van Roy Stanford University bvr@stanford.edu
Pseudocode	No	The paper describes theoretical concepts and proofs without including any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets	No	The paper is theoretical and defines problem settings (linear bandits, logistic bandits) rather than using specific, named datasets. The 'simulated information ratio values' in Figure 1 are results from a computation to support a conjecture, not data from a publicly accessible dataset.
Dataset Splits	No	The paper is theoretical and does not describe experimental validation on datasets with explicit splits.
Hardware Specification	No	The paper does not specify any hardware details for its theoretical analysis or the computational results presented in Figure 1.
Software Dependencies	No	The paper does not mention specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not provide specific experimental setup details, hyperparameters, or training configurations.