Optimal Best Markovian Arm Identification with Fixed Confidence

Authors: Vrettos Moulos

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We derive instance specific nonasymptotic and asymptotic lower bounds which generalize those of the IID setting. We analyze the Track-and-Stop strategy, initially proposed for the IID setting, and we prove that asymptotically it is at most a factor of four apart from the lower bound. Our one-parameter Markovian bandit model is based on the notion of an exponential family of stochastic matrices for which we establish many useful properties. For the analysis of the Track-and-Stop strategy we derive a novel concentration inequality for Markov chains that may be of interest in its own right.
Researcher Affiliation Academia Vrettos Moulos University of California Berkeley vrettos@berkeley.edu
Pseudocode No The paper does not contain structured pseudocode or clearly labeled algorithm blocks.
Open Source Code No The paper does not provide any concrete access to source code (e.g., specific repository link, explicit code release statement) for the methodology described.
Open Datasets No The paper is theoretical and does not mention using any specific datasets or providing access information for publicly available datasets for training purposes.
Dataset Splits No The paper is theoretical and does not provide specific dataset split information (percentages, sample counts, or splitting methodology) needed for reproduction, as it does not involve empirical validation.
Hardware Specification No The paper does not provide any specific hardware details (e.g., GPU/CPU models, processor types) used for running experiments, as it is a theoretical work.
Software Dependencies No The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers), as it is a theoretical work without empirical implementation details.
Experiment Setup No The paper does not contain specific experimental setup details (e.g., concrete hyperparameter values, training configurations, or system-level settings), as it is a theoretical work.