Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification

Authors: Haolin Liu, Artin Tajdini, Andrew Wagenmaker, Chen-Yu Wei

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In linear bandits, how can a learner effectively learn when facing corrupted rewards? While significant work has explored this question, a holistic understanding across different adversarial models and corruption measures is lacking, as is a full characterization of the minimax regret bounds. In this work, we compare two types of corruption commonly considered: strong corruption, where the corruption level depends on the learner's chosen action, and weak corruption, where the corruption level does not depend on the learner's chosen action. We provide a unified framework to analyze these corruptions. For stochastic linear bandits, we fully characterize the gap between the minimax regret under strong and weak corruptions. We also initiate the study of corrupted adversarial linear bandits, obtaining upper and lower bounds with matching dependencies on the corruption level. (A sketch contrasting the two corruption measures appears after this table.)
Researcher Affiliation | Academia | Haolin Liu (University of Virginia, srs8rh@virginia.edu); Artin Tajdini (University of Washington, artin@cs.washington.edu); Andrew Wagenmaker (University of California, Berkeley, ajwagen@berkeley.edu); Chen-Yu Wei (University of Virginia, chenyu.wei@virginia.edu)
Pseudocode | Yes | Algorithm 1: Randomized Phased Elimination (for the stochastic strong- and weak-corruption bounds); Algorithm 2: FTRL with a log-determinant barrier regularizer (for the adversarial corruption bound); Algorithm 3: Continuous exponential weights (for the adversarial corruption bound). (A skeleton of the phased-elimination template appears after this table.)
Open Source Code | No | This paper is theoretical and does not include experiments or a statement about open-sourcing code for the described methodology. The NeurIPS Paper Checklist confirms this, stating NA for experiments and code in sections 4 and 5.
Open Datasets | No | This paper is theoretical and does not use or reference any specific dataset. The NeurIPS Paper Checklist states NA for experimental details in sections 4 through 8.
Dataset Splits | No | This paper is theoretical and does not involve empirical studies with dataset splits. The NeurIPS Paper Checklist states NA for experimental details in sections 4 through 8.
Hardware Specification | No | This paper is theoretical and does not involve experiments, so no hardware specifications are provided. The checklist sections on experimental details (4 through 8) are marked NA.
Software Dependencies | No | This paper is theoretical and does not involve experiments requiring software dependencies with version numbers. The checklist sections on experimental details (4 through 8) are marked NA.
Experiment Setup | No | This paper is theoretical and does not involve experiments, so no setup details such as hyperparameters or training configurations are provided. The checklist sections on experimental details (4 through 8) are marked NA.
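
To make the abstract's strong/weak distinction concrete, below is a minimal Python sketch, not taken from the paper: a stochastic linear bandit with an illustrative corruption attack, together with one common formalization consistent with the abstract's description. The strong measure charges only the corruption on the action the learner actually chose (so it depends on the chosen action), while the weak measure charges the worst-case corruption over all actions in each round. All names, constants, and the attack itself are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 4, 10, 200
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)                    # hidden reward parameter
A = rng.normal(size=(K, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)     # finite unit-norm action set
means = A @ theta

def corruption(t):
    """Illustrative attack: depress the best arm's reward for an early burst."""
    c = np.zeros(K)
    if t < 25:
        c[np.argmax(means)] = -0.5
    return c

C_strong = C_weak = 0.0
for t in range(T):
    a_t = rng.integers(K)                         # stand-in for a learner's choice
    c = corruption(t)
    r_t = means[a_t] + rng.normal(scale=0.1) + c[a_t]   # observed reward
    C_strong += abs(c[a_t])              # charges only the chosen action
    C_weak += np.max(np.abs(c))          # charges the worst action each round
print(f"C_strong={C_strong:.1f} <= C_weak={C_weak:.1f}")
```

Since |c_t(a_t)| <= max_a |c_t(a)| in every round, the strong measure never exceeds the weak one; per the abstract, the paper characterizes exactly how the minimax regret differs between the two measures for stochastic linear bandits.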
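
The paper's Algorithm 1 builds on the classical phased-elimination template for stochastic linear bandits. The sketch below shows only that template: per-phase least-squares estimation followed by gap-based elimination. It omits the randomization and corruption-robust thresholds that the paper adds, uses a uniform per-phase allocation as a crude stand-in for an optimal-design allocation, and all names and constants are illustrative.

```python
import numpy as np

def phased_elimination(A, pull, T, delta=0.05):
    """Skeleton of phased elimination for stochastic linear bandits.

    A    : (K, d) array of unit-norm actions.
    pull : callback pull(arm_index) -> observed reward.
    """
    K, d = A.shape
    active = np.arange(K)
    eps, t = 1.0, 0
    while t < T and len(active) > 1:
        # Pull each surviving arm enough times to estimate its mean to ~eps.
        n = int(np.ceil(2 * d * np.log(2 * K / delta) / (eps**2 * len(active))))
        V, s = np.zeros((d, d)), np.zeros(d)
        for a in active:
            for _ in range(n):
                if t >= T:
                    break
                r = pull(a)
                V += np.outer(A[a], A[a])
                s += r * A[a]
                t += 1
        theta_hat = np.linalg.solve(V + 1e-8 * np.eye(d), s)  # least squares
        est = A[active] @ theta_hat
        active = active[est >= est.max() - 2 * eps]  # drop large-gap arms
        eps /= 2
    return active

# Example against the (uncorrupted) environment from the sketch above:
# survivors = phased_elimination(A, lambda a: means[a] + rng.normal(scale=0.1), T=5000)
```

Roughly speaking, the randomization the paper layers on top of this template makes the learner's pulls less predictable, which limits how much a strong (action-dependent) adversary can bias each phase's least-squares estimate.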