Cornering Stationary and Restless Mixing Bandits with Remix-UCB
Authors: Julien Audiffren, Liva Ralaivola
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We provide a regret analysis for this bandit strategy; two noticeable features of Remix-UCB are that i) it reduces to the regular Improved-UCB when the ϕ-mixing coefficients are all 0, i.e. when the i.i.d. scenario is recovered, and ii) when $\phi(n) = O(n^{-\alpha})$, it is able to ensure a controlled regret of order $\widetilde{\Theta}\big(\Delta_*^{(\alpha-2)/\alpha} \log^{1/\alpha} T\big)$, where $\Delta_*$ encodes the distance between the best arm and the best suboptimal arm, even in the case when $\alpha < 1$, i.e. the case when the ϕ-mixing coefficients are not summable. |
| Researcher Affiliation | Academia | Julien Audiffren CMLA ENS Cachan, Paris Saclay University 94235 Cachan France audiffren@cmla.ens-cachan.fr Liva Ralaivola QARMA, LIF, CNRS Aix Marseille University F-13289 Marseille cedex 9, France liva.ralaivola@lif.univ-mrs.fr |
| Pseudocode | Yes | Algorithm 1 Remix-UCB, with parameter $K$, $(\alpha_i)_{i=1}^{K}$, $T$, $G$ defined in (11) |
| Open Source Code | No | The paper does not provide a statement about making its source code available or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not mention using or providing access to any publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not specify any training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup including hyperparameters or system-level training settings. |