Multi-armed Bandits: Competing with Optimal Sequences
Authors: Zohar S. Karnin, Oren Anava
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main result is a regret guarantee that scales with the variation parameter of the environment, without requiring any prior knowledge about it whatsoever. By that, we also resolve an open problem posted by Gur, Zeevi and Besbes [8]. An important key component in our result is a statistical test for identifying non-stationarity in a sequence of independent random variables. This test either identiļ¬es nonstationarity or upper-bounds the absolute deviation of the corresponding sequence of mean values in terms of its total variation. This test is interesting on its own right and has the potential to be found useful in additional settings. |
| Researcher Affiliation | Industry | Oren Anava The Voleon Group Berkeley, CA oren@voleon.com Zohar Karnin Yahoo! Research New York, NY zkarnin@yahoo-inc.com |
| Pseudocode | Yes | Our algorithm is formally given in Algorithm 1, and the working scheme is visually presented in Figure 2. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments performed on any specific, publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation with dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not specify any hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies or versions. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations. |