Linear Contextual Bandits with Knapsacks

Authors: Shipra Agrawal, Nikhil Devanur

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We present algorithms with near-optimal regret bounds for this problem. Our bounds compare favorably to results on the unstructured version of the problem [5, 10] where the relation between the contexts and the outcomes could be arbitrary, but the algorithm only competes against a fixed set of policies accessible through an optimization oracle. We combine techniques from the work on lin Contextual, Bw K and OSPP in a nontrivial manner while also tackling new difficulties that are not present in any of these special cases.
Researcher Affiliation Collaboration Columbia University. sa3305@columbia.edu. Microsoft Research. nikdev@microsoft.com.
Pseudocode Yes Algorithm 1 Algorithm for lin CBw K, with given Z ... Algorithm 2 Algorithm for lin CBw K, with Z computation
Open Source Code No The paper does not contain any statement or link indicating that source code for the described methodology is publicly available.
Open Datasets No The paper is theoretical and focuses on algorithm design and regret bounds. It does not describe experiments using specific datasets for training.
Dataset Splits No The paper is theoretical and does not describe experiments with dataset splits for validation.
Hardware Specification No The paper is theoretical and does not describe any experimental setup or the hardware used for computations.
Software Dependencies No The paper is theoretical and does not mention any specific software dependencies with version numbers.
Experiment Setup No The paper is theoretical and does not describe an experimental setup, including hyperparameters or training configurations.