Regret Bounds for Batched Bandits
Authors: Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni (pp. 7340-7348)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove bounds for their expected regrets that improve and extend the best known regret bounds of Gao, Han, Ren, and Zhou (NeurIPS 2019), for any number of batches. In this paper, we study the problem of batch policies in the context of multi-armed and linear bandits with the goal of minimizing regret, the standard benchmark for comparing performance of bandit policies. We advance the theoretical understanding of these problems by designing algorithms along with hardness results. |
| Researcher Affiliation | Collaboration | Hossein Esfandiari (1), Amin Karbasi (2), Abbas Mehrabian (3), Vahab Mirrokni (1). (1) Google Research, New York City, New York, USA; (2) School of Engineering and Applied Science, Yale University, New Haven, Connecticut, USA; (3) McGill University, Montréal, Quebec, Canada |
| Pseudocode | Yes | Algorithm 1: Batched arm elimination for stochastic multiarmed bandits |
| Open Source Code | No | The paper does not provide any statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and regret bounds for bandit problems; it neither uses nor describes any datasets. |
| Dataset Splits | No | As a theoretical paper, it does not describe experimental validation on data splits. |
| Hardware Specification | No | The paper is theoretical and does not discuss hardware specifications for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical, focusing on algorithm design and analysis rather than empirical experiments, and therefore does not provide details on experimental setup or hyperparameters. |