Bandits with Knapsacks beyond the Worst Case
Authors: Karthik Abinav Sankararaman, Aleksandrs Slivkins
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | While worst-case regret bounds for BwK are well-understood, we present three results that go beyond the worst-case perspective. First, we provide upper and lower bounds which amount to a full characterization for logarithmic, instance-dependent regret rates. Second, we consider "simple regret" in BwK, which tracks algorithm's performance in a given round, and prove that it is small in all but a few rounds. Third, we provide a general "reduction" from BwK to bandits which takes advantage of some known helpful structure, and apply this reduction to combinatorial semi-bandits, linear contextual bandits, and multinomial-logit bandits. |
| Researcher Affiliation | Industry | Karthik Abinav Sankararaman Facebook AI, Menlo Park karthikabinavs@gmail.com Aleksandrs Slivkins Microsoft Research, New York City slivkins@microsoft.com |
| Pseudocode | No | The paper describes the UcbBwK algorithm using mathematical formulations and descriptive text, but it does not provide a formal pseudocode block or a clearly labeled 'Algorithm' section with structured steps. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include links to a code repository. |
| Open Datasets | No | This is a theoretical paper that does not conduct empirical experiments or use datasets for training. |
| Dataset Splits | No | As a theoretical paper, it does not describe experimental validation on datasets, and therefore no training/validation/test dataset splits are specified. |
| Hardware Specification | No | This is a theoretical paper that does not conduct experiments requiring specific hardware, so no hardware specifications are mentioned. |
| Software Dependencies | No | This theoretical paper does not describe software implementations or dependencies with version numbers. |
| Experiment Setup | No | As a theoretical paper, there are no empirical experiments conducted, and therefore no specific experimental setup details like hyperparameters or training settings are provided. |