Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

Authors: Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "In this work, we develop linear bandit algorithms that automatically adapt to different environments. By plugging a novel loss estimator into the optimization problem that characterizes the instance-optimal strategy, our first algorithm not only achieves nearly instance-optimal regret in stochastic environments, but also works in corrupted environments with additional regret being the amount of corruption, while the state-of-the-art (Li et al., 2019) achieves neither instance-optimality nor the optimal dependence on the corruption amount. Moreover, by equipping this algorithm with an adversarial component and carefully designed testings, our second algorithm additionally enjoys minimax-optimal regret in completely adversarial environments, which is the first of this kind to our knowledge. Finally, all our guarantees hold with high probability, while existing instance-optimal guarantees only hold in expectation."
Researcher Affiliation | Academia | 1) University of Southern California; 2) The Chinese University of Hong Kong.
Pseudocode | Yes | Algorithm 1: Randomized Instance-optimal Algorithm; Algorithm 2: BOTW (Best of Three Worlds); Algorithm 3: BOTW-SE (BOTW, Single Epoch).
Open Source Code | No | The paper does not state that code is released and provides no link to a code repository.
Open Datasets | No | The paper is theoretical and does not describe experiments on any specific dataset.
Dataset Splits | No | The paper is theoretical and does not describe dataset splits for validation.
Hardware Specification | No | The paper is theoretical and does not mention any hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not list software dependencies or version numbers.
Experiment Setup | No | The paper focuses on theoretical algorithms and does not describe concrete experimental details such as hyperparameters or training configurations.
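For readers unfamiliar with the linear-bandit setting the abstract assumes, the following is a minimal, hypothetical sketch of a stochastic linear bandit loop with a greedy least-squares learner. It is illustrative background only: it is not the paper's Algorithm 1 (which solves an instance-optimality optimization problem and uses a novel loss estimator), and every name and parameter below is an assumption made for this example.

```python
import numpy as np

def run_linear_bandit(actions, theta, T=500, noise=0.1, seed=0):
    """Toy stochastic linear bandit: the expected loss of action a is <a, theta>.
    A hypothetical greedy learner with a ridge least-squares estimate of theta;
    NOT the instance-optimal algorithm from the paper.
    Returns cumulative pseudo-regret against the best fixed action."""
    rng = np.random.default_rng(seed)
    d = len(theta)
    A = np.eye(d)          # ridge-regularized design matrix
    b = np.zeros(d)        # running sum of (action * observed loss)
    best = min(a @ theta for a in actions)   # smallest expected loss
    regret = 0.0
    for t in range(T):
        theta_hat = np.linalg.solve(A, b)    # least-squares estimate
        if t < d:
            # crude forced exploration for the first d rounds
            a = actions[t % len(actions)]
        else:
            # play greedily w.r.t. the current estimate
            a = min(actions, key=lambda x: x @ theta_hat)
        loss = a @ theta + noise * rng.standard_normal()
        A += np.outer(a, a)
        b += a * loss
        regret += a @ theta - best
    return regret
```

With two orthogonal arms and a clear gap, the greedy learner locks onto the better arm after the forced-exploration rounds, so the pseudo-regret stays bounded; the paper's contribution is precisely to replace such ad-hoc exploration with a strategy whose regret matches the instance-dependent lower bound.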