Bandits with Abstention under Expert Advice

Authors: Stephen Pasteris, Alberto Rumi, Maximilian Thiessen, Shota Saito, Atsushi Miyauchi, Fabio Vitale, Mark Herbster

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We are able to leverage a wide range of inductive biases, outperforming previous approaches both theoretically and in preliminary experimental analysis. Additionally, we achieve a reduction in runtime from quadratic to almost linear in the number of contexts for the specific case of metric space contexts.
Researcher Affiliation | Academia | Stephen Pasteris (1), Alberto Rumi (2,3), Maximilian Thiessen (4), Shota Saito (5), Atsushi Miyauchi (3), Fabio Vitale (3), Mark Herbster (5). (1) The Alan Turing Institute; (2) University of Milan; (3) CENTAI Institute; (4) TU Wien; (5) University College London.
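Pseudocode | Yes | Algorithm 1 CBA($w_1$, $\eta$). For $t = 1, 2, \ldots, T$ do:
  1. For all $i \in [E]$ receive $e_t^i$
  2. For all $i \in [E]$ set $c_{t,i} \leftarrow \|e_t^i\|_1$
  3. If $\sum_{i \in [E]} c_{t,i} w_{t,i} \le 1$ then:
     (a) Set $\tilde{w}_t \leftarrow w_t$
  4. Else:
     (a) By interval bisection find $\lambda > 0$ such that $\sum_{i \in [E]} c_{t,i} w_{t,i} \exp(-\lambda c_{t,i}) = 1$
     (b) For all $i \in [E]$ set $\tilde{w}_{t,i} \leftarrow w_{t,i} \exp(-\lambda c_{t,i})$
  5. Set $s_t \leftarrow \sum_{i \in [E]} \tilde{w}_{t,i} e_t^i$
  6. Draw $a_t \sim s_t$
  7. Receive $r_{t,a_t}$
  8. For all $a \in [K]$ set $\hat{r}_{t,a} \leftarrow 1 - [\![a = a_t]\!] (1 - r_{t,a_t}) / s_{t,a_t}$
  9. For all $i \in [E]$ set $w_{(t+1),i} \leftarrow \tilde{w}_{t,i} \exp(\eta \, e_t^i \cdot \hat{r}_t)$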
Open Source Code | Yes | This section conducts preliminary experiments; the code is available at GitHub [3]. ... [3] https://github.com/albertorumi/ContextualBanditsWithAbstention
Open Datasets | Yes | We compare our approach CBA using each of these bases on real-world and artificial graphs against the following baselines: an implementation of CONTEXTUALBANDIT from Slivkins [2011], the GABA-II algorithm proposed by Herbster et al. [2021], and an EXP3 instance for each data point. We use the following graphs for evaluation. Stochastic block model. ... Gaussian graph. ... Real-world dataset. We tested our approach on the Cora dataset [Sen et al., 2008] and the LastFM Asia dataset [Leskovec and Krevl, 2014].
Dataset Splits | No | The paper describes an online bandit setting where a node is chosen uniformly at random at each time step. It does not mention specific train/validation splits or their sizes, as is typical in supervised learning.
Hardware Specification | Yes | We run our experiments with an Intel Xeon Gold 6312U processor and 256 GB of RAM ECC 3200 MHz.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | No | The paper describes the algorithm and its parameters theoretically and mentions varying settings for graph generation in the experiments, but it does not list specific hyperparameter values (e.g., learning rates or the eta/w1 values used in the empirical runs) or detailed configurations for the experimental setup.
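
One trial of the Algorithm 1 pseudocode quoted in the Pseudocode row can be sketched in Python as below. This is a minimal illustration, not the authors' implementation (their code is in the linked repository). It assumes the extracted pseudocode lost its minus signs and tildes, so that step 4 shrinks the weights by exp(-λ c) and steps 5 and 9 use the shrunk weights. The function names project_weights and cba_round, the reward_fn callback, the bisection iteration count, and the use of None to represent abstention (drawn with the probability mass left over by s_t) are all hypothetical choices made for this sketch.

```python
import math
import random


def project_weights(w, c, iters=60):
    """Steps 3-4: return weights w_tilde with sum_i c[i] * w_tilde[i] <= 1."""
    if sum(ci * wi for ci, wi in zip(c, w)) <= 1.0:
        return list(w)  # step 3: constraint already holds

    def mass(lam):
        return sum(ci * wi * math.exp(-lam * ci) for ci, wi in zip(c, w))

    # mass() is decreasing in lam, so first bracket the root, then bisect.
    lo, hi = 0.0, 1.0
    while mass(hi) > 1.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    # Use the upper end so the constraint holds despite rounding.
    return [wi * math.exp(-hi * ci) for ci, wi in zip(c, w)]


def cba_round(w, experts, reward_fn, eta, rng):
    """One trial. experts[i] is a length-K list of nonnegative numbers
    summing to at most 1; reward_fn(a) is a hypothetical environment callback."""
    K = len(experts[0])
    c = [sum(e) for e in experts]                      # step 2: 1-norms
    w_tilde = project_weights(w, c)                    # steps 3-4
    s = [sum(wt * e[a] for wt, e in zip(w_tilde, experts))
         for a in range(K)]                            # step 5
    # Step 6: draw action a with probability s[a]; otherwise abstain (None).
    u, acc, action = rng.random(), 0.0, None
    for a in range(K):
        acc += s[a]
        if u < acc:
            action = a
            break
    # Steps 7-8: the estimate is 1 for every action except the one played.
    r_hat = [1.0] * K
    if action is not None:
        r_hat[action] = 1.0 - (1.0 - reward_fn(action)) / s[action]
    # Step 9: exponential update using the inner product e_t^i . r_hat.
    w_next = [wt * math.exp(eta * sum(e[a] * r_hat[a] for a in range(K)))
              for wt, e in zip(w_tilde, experts)]
    return w_next, action
```

A caller would loop over trials, passing the returned weights back in as w together with the current expert predictions, for example with rng = random.Random(0) for reproducible draws. The bisection returns the upper end of its bracket, which keeps the total action probability at or below 1 at the cost of a negligible amount of extra abstention.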