Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Authors: Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we develop linear bandit algorithms that automatically adapt to different environments. By plugging a novel loss estimator into the optimization problem that characterizes the instanceoptimal strategy, our first algorithm not only achieves nearly instance-optimal regret in stochastic environments, but also works in corrupted environments with additional regret being the amount of corruption, while the state-of-the-art (Li et al., 2019) achieves neither instance-optimality nor the optimal dependence on the corruption amount. Moreover, by equipping this algorithm with an adversarial component and carefully-designed testings, our second algorithm additionally enjoys minimax-optimal regret in completely adversarial environments, which is the first of this kind to our knowledge. Finally, all our guarantees hold with high probability, while existing instance-optimal guarantees only hold in expectation. |
| Researcher Affiliation | Academia | 1University of Southern California 2The Chinese University of Hong Kong. |
| Pseudocode | Yes | Algorithm 1 Randomized Instance-optimal Algorithm; Algorithm 2 BOTW (Best of Three Worlds); Algorithm 3 BOTW-SE (BOTW Single Epoch) |
| Open Source Code | No | The paper does not provide any statement about releasing open-source code or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments with specific datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper focuses on theoretical algorithms and does not describe concrete experimental setup details like hyperparameters or training configurations. |