Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
Authors: Xuheng Li, Quanquan Gu
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we examine our algorithm, FGTS-VA, against baselines (including Weighted OFUL+, FGTS, and SAVE) in experiments with synthetic data. The code can be found at https://github.com/xuheng-li99/FGTS-VA. ... Figure 1: Comparison of different algorithms. Error bands are plotted over 100 runs. |
| Researcher Affiliation | Academia | Xuheng Li Department of Computer Science University of California, Los Angeles California, 90095 EMAIL Quanquan Gu Department of Computer Science University of California, Los Angeles California, 90095 EMAIL |
| Pseudocode | Yes | Algorithm 1 FGTS-VA 1: Given hyperparameter $\alpha$ and $\gamma$. Initialize $S_0 = \emptyset$. 2: for $t = 1$ to $T$ do 3: Receive context $x_t$. 4: Set parameters $\{\eta_s\}_{s \in [t-1]}$ and $\lambda_t$ according to (4.2). 5: Sample $f_t \sim p_t(\cdot \mid S_{t-1})$, with the posterior distribution $p_t(f \mid S_{t-1})$ defined in (4.1). 6: Select $a_t = \arg\max_{a \in \mathcal{A}_t} f_t(x_t, a)$. 7: Observe reward $r_t$; update $S_t = S_{t-1} \cup \{(x_t, a_t, r_t)\}$. 8: end for |
| Open Source Code | Yes | The code can be found at https://github.com/xuheng-li99/FGTS-VA. |
| Open Datasets | No | We focus on the setting of linear bandits with $d = 5$ and $\mathcal{X} = \{x\}$, so we omit the context $x$ for simplicity. The action set is $\mathcal{A}_t = \mathcal{A} = \{\pm 1/\sqrt{d}\}^d$, and the ground truth parameter $\theta$ is sampled from the uniform distribution on the unit sphere. We consider two noise models with heterogeneous noise magnitudes. In both cases, the noise $\epsilon_t$ is sampled from $\mathcal{N}(0, \sigma_t^2)$. |
| Dataset Splits | No | The paper uses synthetic data and runs experiments for |
| Hardware Specification | No | The paper does not provide specific hardware details for the experimental runs. The NeurIPS checklist mentions: "The experiments are runnable using a personal laptop within minutes." However, this is not a specific hardware specification. |
| Software Dependencies | No | The paper describes algorithmic details like "Langevin dynamics" and "SGLD steps" but does not specify any software libraries or packages with version numbers used for implementation. The NeurIPS checklist notes code is on GitHub, but specific dependencies are not mentioned in the paper text. |
| Experiment Setup | Yes | In this section, we examine our algorithm, FGTS-VA, against baselines (including Weighted OFUL+, FGTS, and SAVE) in experiments with synthetic data. ... Implementation details. For FGTS-VA, in the linear bandit setting, we let the prior distribution be the Gaussian distribution $\mathcal{N}(0, I_d/d)$. We use Langevin dynamics to sample from this distribution: ... We use $K = 20$ SGLD steps in our experiments, and initialize $\theta_{t+1}^{(0)} = \theta_t^{(K)}$. ... We first compare FGTS-VA with $c = 0.003$ against Weighted OFUL+ (Zhou and Gu, 2022), SAVE (Zhao et al., 2023), and FGTS (Zhang, 2022) with results in Figure 1. ... We then perform ablation studies of the algorithm with different choices of $c$. It is worth noting that $c$ is the only tunable parameter of FGTS-VA, and $c = \tilde{\Theta}(1)$ for linear bandits according to Theorem 5.4. The results are shown in Figure 2. For the case of sparse noise, we observe the advantage of choosing $c$ bounded away from 0, i.e., advantage of the feel-good exploration. For the case of dense noise, the optimal choice of $c$ is close to 0. |
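
For readers who want to regenerate a comparable synthetic environment, the setup quoted in the table can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the action set is the scaled hypercube {±1/√d}^d, and the function names (`make_action_set`, `sample_unit_sphere`, `pull`) are hypothetical. The per-round noise level `sigma_t` is left as an input because the two noise models are not specified in the quoted text.

```python
import itertools
import numpy as np

def make_action_set(d: int) -> np.ndarray:
    """All 2^d sign vectors scaled by 1/sqrt(d), so every action has unit norm."""
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))
    return signs / np.sqrt(d)

def sample_unit_sphere(d: int, rng: np.random.Generator) -> np.ndarray:
    """Uniform draw on the unit sphere, obtained by normalizing a standard Gaussian."""
    v = rng.standard_normal(d)
    return v / np.linalg.norm(v)

def pull(theta: np.ndarray, action: np.ndarray, sigma_t: float,
         rng: np.random.Generator) -> float:
    """Linear reward <theta, action> plus N(0, sigma_t^2) noise.

    sigma_t is supplied by the caller; the paper's noise schedules are not
    reproduced here.
    """
    return float(action @ theta + sigma_t * rng.standard_normal())

rng = np.random.default_rng(0)
d = 5                                   # dimension stated in the summary
actions = make_action_set(d)            # shape (32, 5)
theta = sample_unit_sphere(d, rng)      # ground-truth parameter
best_action = actions[np.argmax(actions @ theta)]
```

A bandit algorithm under test would call `pull` once per round with its chosen action, and regret can be measured against `best_action @ theta`.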