The Pareto Frontier of model selection for general Contextual Bandits
Authors: Teodor Vanislavov Marinov, Julian Zimmert
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | (1) We provide a Pareto frontier of upper bounds for model selection in contextual bandits with finite-sized policy classes. (2) We present matching lower bounds that show that our upper bounds are tight, thereby resolving the motivating open problems [Foster et al., 2020b]. (3) We present a novel impossibility result for adapting to the number of switch points under adaptive adversaries [Besbes et al., 2014]. (4) We negatively resolve an open problem on second-order bounds for full information [Freund, 2016]. |
| Researcher Affiliation | Collaboration | Teodor Marinov, Google Research (tvmarinov@google.com); Julian Zimmert, Google Research (zimmert@google.com). One author was at Johns Hopkins University during part of this work. |
| Pseudocode | Yes | Algorithm 1: Hedged FTRL |
| Open Source Code | No | The paper does not contain any statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets, training, or public dataset access. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental validation or specific dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup, thus no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not mention any software dependencies with specific version numbers. |
| Experiment Setup | No | The paper is theoretical and does not include details about an experimental setup, such as hyperparameters or training configurations. |
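The pseudocode row above names "Algorithm 1: Hedged FTRL", which layers an exponential-weights (Hedge) master over base learners via follow-the-regularized-leader. As a minimal illustration of the underlying Hedge/FTRL idea only (not the paper's Algorithm 1, whose base learners, loss estimates, and regularizers are specified in the paper), here is a sketch of exponential weights over a finite set of experts; the function name `hedge` and the loss interface are assumptions for this example:

```python
import math

def hedge(loss_fn, n_experts, n_rounds, lr=0.5):
    """Exponential-weights sketch: FTRL with an entropic regularizer
    over n_experts reduces to these multiplicative weights.

    loss_fn(t) -> list of per-expert losses in [0, 1] at round t.
    Returns (algorithm's cumulative expected loss, best expert's loss).
    """
    cum = [0.0] * n_experts  # cumulative loss of each expert
    alg_loss = 0.0
    for t in range(n_rounds):
        # FTRL step: minimizing <p, cum> + (1/lr) * entropy regularizer
        # gives the exponential-weights distribution below.
        w = [math.exp(-lr * c) for c in cum]
        z = sum(w)
        p = [wi / z for wi in w]
        losses = loss_fn(t)
        # Expected loss of sampling an expert from p this round.
        alg_loss += sum(pi * li for pi, li in zip(p, losses))
        cum = [c + l for c, l in zip(cum, losses)]
    return alg_loss, min(cum)
```

With a suitable learning rate, the regret `alg_loss - min(cum)` grows as `O(sqrt(T log K))` for `K` experts over `T` rounds; the paper's contribution concerns how such guarantees trade off across nested policy classes, which this toy sketch does not capture.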