Taming Heavy-Tailed Losses in Adversarial Bandits and the Best-of-Both-Worlds Setting

Authors: Duo Cheng, Xingyu Zhou, Bo Ji

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we study the multi-armed bandit problem in the best-of-both-worlds (BOBW) setting with heavy-tailed losses, where the losses can be negative and unbounded but have (1+v)-th raw moments bounded by u^{1+v} for some known u > 0 and v ∈ (0, 1]. We propose an algorithm and prove that it achieves a T^{1/(1+v)}-type worst-case (pseudo-)regret in the adversarial regime and a log T-type gap-dependent regret in the stochastic regime, where T is the time horizon.
Researcher Affiliation | Academia | Duo Cheng (Virginia Tech, duocheng@vt.edu); Xingyu Zhou (Wayne State University, xingyu.zhou@wayne.edu); Bo Ji (Virginia Tech, boji@vt.edu)
Pseudocode | Yes | Algorithm 1: OMD with log-barrier and increasing learning rates for heavy-tailed MABs (OMD-LB-HT); Algorithm 2: SAO for heavy-tailed MABs (SAO-HT)
Open Source Code | No | The paper does not contain any statement about releasing open-source code for the described methodology, nor a link to a repository.
Open Datasets | No | The paper is theoretical and does not describe experiments using a dataset. It refers to "real-world data from application domains such as finance (Cont, 2001) and imaging (Hamza & Krim, 2001)" as examples of heavy-tailed distributions, but these are not used as datasets for empirical evaluation.
Dataset Splits | No | The paper is theoretical and does not describe experiments with dataset splits.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware.
Software Dependencies | No | The paper is theoretical and does not mention software dependencies with version numbers for experimental reproducibility.
Experiment Setup | No | The paper is theoretical and defines algorithm parameters (e.g., the initial learning rate η and the trimming threshold M_{t,i}) as part of the algorithm design, rather than as experimental setup details such as hyperparameters for empirical runs.