Inference for Batched Bandits
Authors: Kelly Zhang, Lucas Janson, Susan Murphy
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our simulations we compare our method to high probability confidence bounds constructed using the self-normalized martingale bound of [1]. |
| Researcher Affiliation | Academia | Kelly W. Zhang Department of Computer Science Harvard University kellywzhang@seas.harvard.edu Lucas Janson Departments of Statistics Harvard University ljanson@fas.harvard.edu Susan A. Murphy Departments of Statistics and Computer Science Harvard University samurphy@fas.harvard.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | No | The paper uses simulated data (e.g., "N(0, 1) rewards") and does not refer to a publicly available or open dataset with access information. |
| Dataset Splits | No | The paper describes simulation parameters (e.g., T=25, n=100, 100k Monte Carlo simulations) but does not specify explicit train/validation/test dataset splits for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | All simulations are with no margin (β1 = β0 = 0); N(0, 1) rewards; T = 25; and n = 100. For ε-greedy, ε = 0.1. We use Thompson Sampling with N(0, 1) priors, a clipping constraint of 0.05 ≤ π_t^(n) ≤ 0.95, N(0, 1) rewards, T = 25, and known σ². We set β1 = β0 = 0, n = 25, and a clipping constraint of 0.1 ≤ π_t^(n) ≤ 0.9. |
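
Since the paper provides no pseudocode or source code, the following is only a minimal sketch of how the ε-greedy setup quoted above (two arms, no margin, N(0, 1) rewards, T = 25 batches, n = 100 samples per batch, ε = 0.1) might be simulated. It is not the authors' implementation. The function name, the use of all past batches to estimate arm means, and the choice of a 0.5 selection probability for the first batch, for ties, and for arms with no data are assumptions made for illustration.

```python
import random


def simulate_batched_epsilon_greedy(T=25, n=100, epsilon=0.1,
                                    beta0=0.0, beta1=0.0, seed=0):
    """Simulate one run of a two-arm batched bandit with epsilon-greedy.

    Hypothetical helper, not taken from the paper. Returns a list with one
    (pi_t, actions, rewards) tuple per batch, where pi_t is the probability
    of selecting arm 1 in that batch.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0]   # running reward totals for arm 0 and arm 1
    counts = [0, 0]     # running pull counts for arm 0 and arm 1
    history = []

    for t in range(T):
        # Assumption: use pi_t = 0.5 when there is nothing to compare yet.
        if t == 0 or 0 in counts:
            pi_t = 0.5
        else:
            mean0 = sums[0] / counts[0]
            mean1 = sums[1] / counts[1]
            if mean1 > mean0:
                pi_t = 1 - epsilon / 2   # favour arm 1
            elif mean1 < mean0:
                pi_t = epsilon / 2       # favour arm 0
            else:
                pi_t = 0.5               # assumption: break ties evenly

        # pi_t is fixed for the whole batch; it only changes between batches.
        actions = [1 if rng.random() < pi_t else 0 for _ in range(n)]
        rewards = [rng.gauss(beta1 if a == 1 else beta0, 1.0) for a in actions]

        # Update the running statistics only after the batch is complete.
        for a, r in zip(actions, rewards):
            sums[a] += r
            counts[a] += 1

        history.append((pi_t, actions, rewards))

    return history
```

Calling `simulate_batched_epsilon_greedy()` yields one simulated run. Repeating it with different seeds would approximate the Monte Carlo design the table refers to. With ε = 0.1, the selection probability in this sketch stays between 0.05 and 0.95 in every batch.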