Finite Continuum-Armed Bandits

Authors: Solenne Gaucher

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | we propose an optimal strategy for this problem. Under natural assumptions on the reward function, we prove that the optimal regret scales as O(T^{1/3}) up to poly-logarithmic factors when the budget T is proportional to the number of actions N. When T becomes small compared to N, a smooth transition occurs. When the ratio T/N decreases from a constant to N^{-1/3}, the regret increases progressively up to the O(T^{1/2}) rate encountered in continuum-armed bandits.
Researcher Affiliation | Academia | Solenne Gaucher, Laboratoire de Mathématiques d'Orsay, Université Paris-Saclay, 91405, Orsay, France; solenne.gaucher@math.u-psud.fr
Pseudocode | Yes | Algorithm 1: Upper Confidence Bound for Finite continuum-armed bandits (UCBF)
Open Source Code | No | Not found. The paper is theoretical and describes an algorithm but does not mention providing access to its source code.
Open Datasets | No | Not found. The paper is theoretical and does not involve the use of datasets for training or evaluation.
Dataset Splits | No | Not found. The paper is theoretical and does not involve experimental validation on datasets, so no dataset splits are mentioned.
Hardware Specification | No | Not found. The paper is purely theoretical and does not describe any computational experiments that would require hardware specifications.
Software Dependencies | No | Not found. The paper is purely theoretical and does not describe any computational experiments that would require software dependencies with version numbers.
Experiment Setup | No | Not found. The paper is purely theoretical and does not describe any experimental setup or hyperparameters.