The Pareto Frontier of model selection for general Contextual Bandits

Authors: Teodor Vanislavov Marinov, Julian Zimmert

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide a Pareto frontier of upper bounds for model selection in contextual bandits with finite-sized policy classes (p. 2). We present matching lower bounds showing that our upper bounds are tight, thereby resolving the motivating open problems [Foster et al., 2020b] (p. 3). We present a novel impossibility result for adapting to the number of switch points under adaptive adversaries [Besbes et al., 2014] (p. 4). We negatively resolve an open problem on second-order bounds in the full-information setting [Freund, 2016]. (See the background note after the table.)
Researcher Affiliation | Collaboration | Teodor Marinov (Google Research, tvmarinov@google.com); Julian Zimmert (Google Research, zimmert@google.com). An author was at Johns Hopkins University during part of this work.
Pseudocode | Yes | Algorithm 1: Hedged FTRL. (A simplified illustrative sketch follows the table.)
Open Source Code | No | The paper contains no statements or links indicating that source code for the described methodology is publicly available.
Open Datasets | No | The paper is theoretical and conducts no experiments, so no datasets are used or released.
Dataset Splits | No | The paper is theoretical and describes no experimental validation, so no training, validation, or test splits are specified.
Hardware Specification | No | The paper is theoretical and describes no experimental setup, so no hardware specifications are provided.
Software Dependencies | No | The paper is theoretical and mentions no software dependencies or version numbers.
Experiment Setup | No | The paper is theoretical and includes no experimental setup details such as hyperparameters or training configurations.
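
Background note for the Research Type row: the paper's setting is adversarial contextual bandits with a nested family of finite policy classes. The snippet below is a minimal sketch of the standard regret notions involved; the notation ($R_T(m)$, $\ell_t$, $x_t$, $K$) is generic and assumed for illustration, not copied from the paper.

```latex
% Generic contextual-bandit regret notation (an illustrative
% assumption, not the paper's verbatim definitions).
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Regret of a learner playing actions $a_t$ against policy class $\Pi_m$:
\[
  R_T(m) \;=\; \max_{\pi \in \Pi_m}
  \mathbb{E}\!\left[\sum_{t=1}^{T} \ell_t(a_t) - \ell_t\bigl(\pi(x_t)\bigr)\right].
\]
Running Exp4 on a single known class $\Pi$ with $K$ actions gives
$R_T = O\bigl(\sqrt{K T \log \lvert \Pi \rvert}\bigr)$. The
model-selection question of [Foster et al., 2020b] asks whether
$R_T(m) = \widetilde{O}\bigl(\sqrt{K T \log \lvert \Pi_m \rvert}\bigr)$
is achievable simultaneously for all $m$.
\end{document}
```

The abstract's claim that the lower bounds resolve the motivating open problems corresponds to answering this question in the negative and characterizing exactly which trade-offs between the $R_T(m)$ are achievable, which is the Pareto frontier of the title.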
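For the Pseudocode row: the paper's Algorithm 1 is called Hedged FTRL. The sketch below is only a minimal full-information stand-in, exponential weights over base learners, which coincides with FTRL under a negative-entropy regularizer. The function name `hedge_master` and the toy loss matrix are assumptions for illustration; the paper's actual algorithm operates under bandit feedback and differs in its loss estimates and regularization.

```python
import numpy as np

rng = np.random.default_rng(0)

def hedge_master(base_losses, eta=0.1):
    """Exponential-weights (Hedge) master over M base learners.

    Hedge is FTRL with a negative-entropy regularizer; this is a
    simplified stand-in for the paper's Algorithm 1 (Hedged FTRL),
    not a faithful reproduction of it.

    base_losses: (T, M) array, loss of each base learner per round.
    Returns the master's expected loss per round.
    """
    T, M = base_losses.shape
    cum = np.zeros(M)           # cumulative losses observed so far
    master_loss = np.empty(T)
    for t in range(T):
        # FTRL with entropy regularizer == exponential weights;
        # subtract the min for numerical stability.
        w = np.exp(-eta * (cum - cum.min()))
        p = w / w.sum()
        master_loss[t] = p @ base_losses[t]
        cum += base_losses[t]   # full-information update
    return master_loss

# Toy usage: two base learners, the first clearly better on average.
losses = rng.uniform(size=(1000, 2))
losses[:, 0] *= 0.5
print(hedge_master(losses).mean())
```

On this toy data the master's average loss approaches that of the better base learner, which is the qualitative behavior a corralling-style master is designed to achieve.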