Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality
Authors: Kwang-Sung Jun, Chicheng Zhang
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our finite-time analysis shows that CROP (i) achieves a constant-factor asymptotic optimality and, thanks to the forced-exploration-free design, (ii) adapts to bounded regret, and (iii) its regret bound scales not with the number of arms K but with an effective number of arms Kψ that we introduce. We also discuss a problem class where CROP can be exponentially better than existing algorithms in nonasymptotic regimes. Finally, we conclude with discussions in Section 6 where we report a surprising finding that UCB can be in fact better than a clairvoyant oracle algorithm (that, of course, achieves the asymptotic optimality) in nonasymptotic regimes. |
| Researcher Affiliation | Academia | Kwang-Sung Jun The University of Arizona kjun@cs.arizona.edu Chicheng Zhang The University of Arizona chichengz@cs.arizona.edu |
| Pseudocode | Yes | Algorithm 1 CRush Optimism with Pessimism (CROP) |
| Open Source Code | No | The paper does not contain any statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and does not describe the use of any datasets for training or evaluation, nor does it provide concrete access information for a publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe any dataset splits (training, validation, test) needed for empirical reproduction. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for implementation or experimentation. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or training settings. |