Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles

Authors: Dylan Foster, Alexander Rakhlin

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide the first universal and optimal reduction from contextual bandits to online regression. We characterize the minimax rates for contextual bandits with general, potentially nonparametric function classes, and show that our algorithm is minimax optimal whenever the oracle obtains the optimal rate for regression.
Researcher Affiliation | Academia | Dylan J. Foster¹, Alexander Rakhlin¹. ¹Massachusetts Institute of Technology. Correspondence to: Dylan Foster <dylanf@mit.edu>.
Pseudocode | Yes | Algorithm 1: SquareCB
Open Source Code | No | The paper is theoretical and does not mention providing open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not describe any experiments involving datasets, public or otherwise.
Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies or version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations.
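
Since the paper releases no code, a rough sketch of the action-selection step in SquareCB (Algorithm 1 in the paper) may help readers see how the reduction to online regression works. The sketch assumes the regression oracle has already produced a predicted loss for each action in the current context. It then applies inverse gap weighting: actions whose predicted loss is far above the best prediction get proportionally less probability. The function names, the `predicted_losses` input, and the default `mu = K` are illustrative choices made here, not taken from this page, and the oracle update and the tuning of the learning rate `gamma` are not shown.

```python
import random


def inverse_gap_weighting(predicted_losses, gamma, mu=None):
    """Turn per-action loss predictions into a sampling distribution.

    predicted_losses: one predicted loss per action (hypothetical oracle output).
    gamma: learning-rate parameter; larger values concentrate on the greedy action.
    mu: exploration constant; assumed here to default to the number of actions.
    """
    k = len(predicted_losses)
    if mu is None:
        mu = k
    # Greedy action: the one with the smallest predicted loss.
    best = min(range(k), key=lambda a: predicted_losses[a])
    probs = [0.0] * k
    for a in range(k):
        if a != best:
            gap = predicted_losses[a] - predicted_losses[best]
            # Probability shrinks as the gap to the greedy action grows.
            probs[a] = 1.0 / (mu + gamma * gap)
    # The greedy action receives all remaining probability mass.
    probs[best] = 1.0 - sum(probs)
    return probs


def sample_action(probs, rng=random):
    """Draw one action index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In a full loop, the sampled action's observed loss would be fed back to the online regression oracle before the next round. With `mu` equal to the number of actions, every non-greedy action gets probability at most `1/K`, so the greedy action always keeps at least `1/K`.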