Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits

Authors: Haolin Liu, Chen-Yu Wei, Julian Zimmert

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We greatly improve these results by achieving a regret of e O( T) without a simulator, while maintaining computational efficiency when the action set in each round is small. In the special case of sleeping bandits with adversarial loss and stochastic arm availability, our result answers affirmatively the open question by Saha et al. [2020] on whether there exists a polynomial-time algorithm with poly(d) T regret. Our approach naturally handles the case where the loss is linear up to an additive misspecification error, and our regret shows near-optimal dependence on the magnitude of the error.
Researcher Affiliation Collaboration Haolin Liu University of Virginia srs8rh@virginia.edu Chen-Yu Wei University of Virginia chenyu.wei@virginia.edu Julian Zimmert Google Research zimmert@google.com
Pseudocode Yes Algorithm 1 Logdet-FTRL for linear contextual bandits. Algorithm 2 Linear EXP4.
Open Source Code No The paper does not provide any explicit statement about releasing open-source code or a link to a code repository for the described methodology.
Open Datasets No The paper is theoretical and focuses on algorithm design and regret bounds; it does not report on empirical studies using specific datasets. Therefore, no information about public datasets is provided.
Dataset Splits No The paper is theoretical and does not conduct empirical experiments with datasets. Therefore, there is no mention of training/validation/test splits.
Hardware Specification No The paper is theoretical and does not describe empirical experiments. Therefore, no hardware specifications are mentioned.
Software Dependencies No The paper is theoretical and does not describe empirical experiments with specific software implementations. Therefore, no software dependencies with version numbers are listed.
Experiment Setup No The paper is theoretical and focuses on algorithm design and analysis rather than empirical experimentation. Therefore, no experimental setup details such as hyperparameters or system-level training settings are provided.