Smoothed Adversarial Linear Contextual Bandits with Knapsacks
Authors: Vidyashankar Sivakumar, Shiliang Zuo, Arindam Banerjee
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We present algorithms and characterize regret for Lin CBw K in the smoothed setting where base context vectors are assumed to be perturbed by Gaussian noise. We consider both the stochastic and adversarial settings for the base contexts, and our analysis of stochastic Lin CBw K can be viewed as a warm-up to the more challenging adversarial Lin CBw K. For the stochastic setting, we obtain Op ? Tq additive regret bounds compared to the best context dependent fixed policy. Our main contribution is an algorithm with Oplog Tq competitive ratio relative to the best context dependent fixed policy for the adversarial setting. |
| Researcher Affiliation | Collaboration | 1Amazon 2Department of Computer Science, University of Illinois, Urbana-Champaign. Correspondence to: Vidyashankar Sivakumar <shankar861@gmail.com>, Shiliang Zuo <szuo3@illinois.edu>, Arindam Banerjee <arindamb@illinois.edu>. |
| Pseudocode | Yes | Algorithm 1 Greedy Lin CBw K; Algorithm 2 Greedy Step; Algorithm 3 Estimate; Algorithm 5 Smoothed Adversarial Lin CBw K |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe experiments run on specific datasets, thus no information about public dataset availability for training is provided. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with data, therefore, no training/test/validation splits are mentioned. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithms and analysis; it does not mention any hardware specifications used for experiments. |
| Software Dependencies | No | The paper is theoretical and describes algorithms and mathematical analysis; it does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical, focusing on algorithmic design and regret analysis. While it describes the 'Greedy Algorithm' and its theoretical properties, it does not provide concrete experimental setup details such as hyperparameter values, training configurations, or system-level settings typically found in empirical studies. |