Taming Heavy-Tailed Losses in Adversarial Bandits and the Best-of-Both-Worlds Setting
Authors: Duo Cheng, Xingyu Zhou, Bo Ji
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study the multi-armed bandits problem in the best-of-both-worlds (BOBW) setting with heavy-tailed losses, where the losses can be negative and unbounded but have (1+v)-th raw moments bounded by u^{1+v} for some known u > 0 and v ∈ (0, 1]. We propose an algorithm and prove that it achieves a T^{1/(1+v)}-type worst-case (pseudo-)regret in the adversarial regime and a (log T)-type gap-dependent regret in the stochastic regime, where T is the time horizon. |
| Researcher Affiliation | Academia | Duo Cheng (Virginia Tech, duocheng@vt.edu); Xingyu Zhou (Wayne State University, xingyu.zhou@wayne.edu); Bo Ji (Virginia Tech, boji@vt.edu) |
| Pseudocode | Yes | Algorithm 1 OMD with log-barrier and increasing learning rates for heavy-tailed MABs (OMD-LB-HT) ... Algorithm 2 SAO for heavy-tailed MABs (SAO-HT) |
| Open Source Code | No | The paper does not contain any statement about releasing open-source code for the described methodology or a link to a repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments using a dataset. It refers to 'real-world data from application domains such as finance (Cont, 2001) and imaging (Hamza & Krim, 2001)' as examples of heavy-tailed distributions, but these are not used as datasets for empirical evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments with dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | The paper is theoretical and defines algorithm parameters (e.g., 'initial learning rate η', 'trimming threshold M_{t,i}') as part of the algorithm design, rather than specific experimental setup details such as hyperparameters for empirical runs. |
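The moment condition quoted above — losses that may be unbounded but whose (1+v)-th raw moment stays below u^{1+v} — and the trimming threshold mentioned in the last row can be illustrated with a minimal sketch. This is not the paper's algorithm; the Pareto distribution, the threshold value, and all variable names here are illustrative assumptions:

```python
import numpy as np

def raw_moment(samples, order):
    """Empirical raw moment E[|X|^order]."""
    return np.mean(np.abs(samples) ** order)

rng = np.random.default_rng(0)
v = 0.5        # tail parameter v in (0, 1]
alpha = 2.0    # Pareto tail index; the p-th moment is finite only for p < alpha
losses = rng.pareto(alpha, size=100_000)  # heavy-tailed, unbounded losses

# The (1+v)-th raw moment is finite here since 1 + v = 1.5 < alpha = 2,
# even though the variance (2nd moment) of this distribution is infinite --
# exactly the regime the paper's assumption targets.
u_1pv = raw_moment(losses, 1 + v)

# Trimming: clip each observed loss at a threshold M before it enters the
# loss estimator, a standard device for handling heavy tails in bandits
# (the threshold M here is a fixed illustrative constant, not the paper's
# time- and arm-dependent M_{t,i}).
M = 10.0
trimmed = np.clip(losses, -M, M)
```

Under this assumption, the trimmed losses are bounded in [-M, M], so standard bounded-loss machinery applies to them, while the moment bound controls the bias introduced by trimming.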