Cornering Stationary and Restless Mixing Bandits with Remix-UCB

Authors: Julien Audiffren, Liva Ralaivola

NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | We provide a regret analysis for this bandit strategy; two noticeable features of Remix-UCB are that i) it reduces to the regular Improved-UCB when the ϕ-mixing coefficients are all 0, i.e. when the i.i.d. scenario is recovered, and ii) when $\phi(n) = O(n^{-\alpha})$, it is able to ensure a controlled regret of order $\widetilde{\Theta}\big(\Delta_*^{(\alpha-2)/\alpha} \log^{1/\alpha} T\big)$, where $\Delta_*$ encodes the distance between the best arm and the best suboptimal arm, even in the case when $\alpha < 1$, i.e. the case when the ϕ-mixing coefficients are not summable. |
| Researcher Affiliation | Academia | Julien Audiffren, CMLA, ENS Cachan, Paris Saclay University, 94235 Cachan, France, audiffren@cmla.ens-cachan.fr; Liva Ralaivola, QARMA, LIF, CNRS, Aix Marseille University, F-13289 Marseille cedex 9, France, liva.ralaivola@lif.univ-mrs.fr |
| Pseudocode | Yes | Algorithm 1 Remix-UCB, with parameter K, $(\alpha_i)_{i=1}^{K}$, T, G defined in (11) |
| Open Source Code | No | The paper does not provide a statement about making its source code available or a link to a code repository. |
| Open Datasets | No | The paper is theoretical and does not mention using or providing access to any publicly available dataset. |
| Dataset Splits | No | The paper is theoretical and does not specify any training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup including hyperparameters or system-level training settings. |
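
To make the regret order quoted in the Research Type row more concrete, the following is a minimal sketch, not code from the paper. It assumes the quoted order reads as Delta_*^((alpha-2)/alpha) * log^(1/alpha) T, drops all constants and any factors hidden by the tilde notation, and uses hypothetical helper names (`regret_exponents`, `regret_order`, `phi_partial_sum`) chosen only for illustration.

```python
import math
from typing import Tuple


def regret_exponents(alpha: float) -> Tuple[float, float]:
    """Return (exponent of Delta_*, exponent of log T) in the assumed regret order."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return (alpha - 2.0) / alpha, 1.0 / alpha


def regret_order(delta_star: float, alpha: float, horizon: int) -> float:
    """Evaluate Delta_*^((alpha-2)/alpha) * log(T)^(1/alpha).

    Constants and hidden logarithmic factors are dropped (an assumption),
    so only the scaling in delta_star, alpha and horizon is meaningful.
    """
    gap_exp, log_exp = regret_exponents(alpha)
    return delta_star ** gap_exp * math.log(horizon) ** log_exp


def phi_partial_sum(alpha: float, n_terms: int) -> float:
    """Partial sum of n^(-alpha) for n = 1..n_terms, to eyeball summability."""
    return sum(n ** -alpha for n in range(1, n_terms + 1))


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        print(alpha, regret_exponents(alpha), regret_order(0.1, alpha, 10_000))
    # For alpha < 1 the partial sums of n^(-alpha) keep growing with n_terms.
    print(phi_partial_sum(0.5, 100), phi_partial_sum(0.5, 10_000))
```

Under this reading, alpha = 1 gives the form log T / Delta_*, while alpha = 1/2 gives Delta_*^(-3) log^2 T, so slower decay of the mixing coefficients makes the bound degrade in both the gap and the horizon. The partial sums illustrate why alpha < 1 corresponds to non-summable coefficients.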