Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis
Authors: Vidyashankar Sivakumar, Steven Wu, Arindam Banerjee
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose simple greedy algorithms for both the single- and multi-parameter (i.e., different parameter for each context) settings and provide a unified regret analysis for θ with any assumed structure. The regret bounds are expressed in terms of geometric quantities such as Gaussian widths associated with the structure of θ . We also obtain sharper regret bounds compared to earlier work for the unstructured θ setting as a consequence of our improved analysis. We show there is implicit exploration in the smoothed setting where a simple greedy algorithm works.3. Single Parameter Regret Analysis, Theorem 1 (Gaussian Contexts Regret Bounds), Theorem 2 (Smoothed Adversary Regret Bounds), 4. Multi Parameter Regret Analysis, Theorem 3 (Multi parameter Smoothed Adversary Regret Bounds). |
| Researcher Affiliation | Collaboration | 1Walmart Labs, Sunnyvale 2Department of Computer Science, University of Minnesota, Twin Cities. Correspondence to: Vidyashankar Sivakumar <sivak017@umn.edu>, Zhiwei Steven Wu <steven7woo@gmail.com>, Arindam Banerjee <banerjee@cs.umn.edu>. |
| Pseudocode | Yes | Algorithm 1 Structured Greedy (single parameter) and Algorithm 2 High-dimensional Greedy (multi parameter). |
| Open Source Code | No | The paper does not provide any information or links regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and focuses on regret analysis of linear contextual bandits. It does not mention the use of any specific public or open datasets for training or evaluation. |
| Dataset Splits | No | The paper focuses on theoretical analysis and does not perform experiments that would require explicit training, validation, or test dataset splits. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or system-level training settings. |