Bandits with Knapsacks beyond the Worst Case

Authors: Karthik Abinav Sankararaman, Aleksandrs Slivkins

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical While worst-case regret bounds for BwK are well-understood, we present three results that go beyond the worst-case perspective. First, we provide upper and lower bounds which amount to a full characterization for logarithmic, instance-dependent regret rates. Second, we consider "simple regret" in BwK, which tracks algorithm's performance in a given round, and prove that it is small in all but a few rounds. Third, we provide a general "reduction" from BwK to bandits which takes advantage of some known helpful structure, and apply this reduction to combinatorial semi-bandits, linear contextual bandits, and multinomial-logit bandits.
Researcher Affiliation Industry Karthik Abinav Sankararaman Facebook AI, Menlo Park karthikabinavs@gmail.com Aleksandrs Slivkins Microsoft Research, New York City slivkins@microsoft.com
Pseudocode No The paper describes the UcbBwK algorithm using mathematical formulations and descriptive text, but it does not provide a formal pseudocode block or a clearly labeled 'Algorithm' section with structured steps.
Open Source Code No The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include links to a code repository.
Open Datasets No This is a theoretical paper that does not conduct empirical experiments or use datasets for training.
Dataset Splits No As a theoretical paper, it does not describe experimental validation on datasets, and therefore no training/validation/test dataset splits are specified.
Hardware Specification No This is a theoretical paper that does not conduct experiments requiring specific hardware, so no hardware specifications are mentioned.
Software Dependencies No This theoretical paper does not describe software implementations or dependencies with version numbers.
Experiment Setup No As a theoretical paper, there are no empirical experiments conducted, and therefore no specific experimental setup details like hyperparameters or training settings are provided.