Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

On the Suboptimality of Thompson Sampling in High Dimensions

Authors: Raymond Zhang, Richard Combes

NeurIPS 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We complement our theoretical results with numerical results and show that in practice TS indeed can perform very poorly in some high dimensional situations.
Researcher Affiliation	Academia	Raymond Zhang ENS Paris Saclay, Département de Mathématiques, Gif-sur-Yvette, France EMAIL Richard Combes Centrale-Supelec, Laboratoire des Signaux et Systèmes (L2S), Gif-sur-Yvette, France EMAIL
Pseudocode	Yes	Figure 1: TS for combinatorial semi-bandits with Bernoulli rewards and uniform prior. Prior knowledge to the learner: combinatorial set X {0, 1}d, function f : X [0, 1]d R For t = 1, ..., T: 1. The learner computes statistics A(t) = Pt 1 s=1 Z(s) x(s) and B(t) = Pt 1 s=1 x(s) Z(s) x(s) 2. The learner draws V (t) with V1(t), ..., Vd(t) independent and Vi(t) Beta(Ai(t) + 1, Bi(t) + 1) 3. The learner chooses decision x(t) arg maxx X {f(x, V (t))} 4. The environment draws Z(t) with Z1(t), ..., Zd(t) independent and Zi(t) Ber(θi) 5. The learner observes Z(t) x(t) and receives the reward f(x(t), Z(t)) Performance metric: expected regret R(T, θ) = T(maxx X {Ef(x, Z(t))}) PT t=1 Ef(x(t), Z(t))
Open Source Code	No	The paper does not provide any statement or link indicating the availability of open-source code for the described methodology.
Open Datasets	No	The paper defines the parameters for the problem instances (e.g., 'θi = 1 if 1 i d/2 and θi = 1m otherwise') and generates data based on Bernoulli distributions, but does not refer to or provide access information for a publicly available or open dataset.
Dataset Splits	No	The paper conducts numerical experiments but does not provide specific details on training, validation, or test dataset splits.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU/CPU models, memory, or detailed computer specifications used for running the experiments.
Software Dependencies	No	The paper does not specify any software names with version numbers used for the implementation or experiments.
Experiment Setup	Yes	Unless speciﬁed otherwise we use 1000 independent sample paths for averaging, and 95% conﬁdence intervals are presented on the plots. Due to limited space, we solely consider the linear case, which is the most often considered in the literature. These experiments are averaged over 40 sample paths due to computational limits. On ﬁgures 6a and 6b we present the regret as a function of m for the set of paths X p, m = 0.1, T = 4.104 and the set of matchings X m, m = 0.05, T = 4.104 respectively.