Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
Authors: Dylan Foster, Alexander Rakhlin
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We provide the first universal and optimal reduction from contextual bandits to online regression. We characterize the minimax rates for contextual bandits with general, potentially nonparametric function classes, and show that our algorithm is minimax optimal whenever the oracle obtains the optimal rate for regression. |
| Researcher Affiliation | Academia | Dylan J. Foster¹, Alexander Rakhlin¹. ¹Massachusetts Institute of Technology. Correspondence to: Dylan Foster <dylanf@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 SquareCB |
| Open Source Code | No | The paper is theoretical and does not mention providing open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe any experiments involving datasets, public or otherwise. |
| Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations. |