Stochastic contextual bandits with graph feedback: from independence number to MAS number
Authors: Yuxiao Wen, Yanjun Han, Zhengyuan Zhou
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we make inroads into this inquiry by establishing a regret lower bound Ω(pβM(G)T), where M is the number of contexts, G is the feedback graph, and βM(G) is our proposed graph-theoretic quantity that characterizes the fundamental learning limit for this class of problems. Interestingly, βM(G) interpolates between α(G) (the independence number of the graph) and m(G) (the maximum acyclic subgraph (MAS) number of the graph) as the number of contexts M varies. We also provide algorithms that achieve near-optimal regret for important classes of context sequences and/or feedback graphs, such as transitively closed graphs that find applications in auctions and inventory control. |
| Researcher Affiliation | Collaboration | New York University Arena Technologies* {yuxiaowen, yanjunhan}@nyu.edu zz26@stern.nyu.edu |
| Pseudocode | Yes | Algorithm 1: Arm elimination algorithm for self-avoiding contexts Algorithm 2: Arm elimination under general contexts |
| Open Source Code | No | The paper does not provide explicit statements about the release of source code for the described methodology or links to a code repository. |
| Open Datasets | No | This is a theoretical paper and does not involve the use of datasets for empirical evaluation. |
| Dataset Splits | No | This is a theoretical paper and does not involve dataset splits for empirical evaluation. |
| Hardware Specification | No | This is a theoretical paper and does not describe hardware used for experiments. |
| Software Dependencies | No | This is a theoretical paper and does not specify software dependencies with version numbers for experimental reproduction. |
| Experiment Setup | No | This is a theoretical paper and does not include details on experimental setup or hyperparameters. |