Understanding Bandits with Graph Feedback
Authors: Houshuang Chen, zengfeng Huang, Shuai Li, Chihao Zhang
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose the notions of the fractional weak domination number δ and the k-packing independence number capturing upper bound and lower bound for the regret respectively. We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual the fractional vertex packing set respectively. Based on this connection, we utilize the strong duality theorem to prove a general regret upper bound O (δ log |V |) 1 3 T 2 3 and a lower bound Ω (δ /α) 1 3 T 2 3. Our main algorithmic result is: Theorem 1. There exists an algorithm such that for any weakly observable graph, any time horizon T n3 log(n)/δ 2(G), its regret satisfies R(G, T) = O (δ (G) log n) 1 3 T 2 3. |
| Researcher Affiliation | Academia | Houshuang Chen Shanghai Jiao Tong University chenhoushuang@sjtu.edu.cn Zengfeng Huang Fudan University huangzf@fudan.edu.cn Shuai Li Shanghai Jiao Tong University shuaili8@sjtu.edu.cn Chihao Zhang Shanghai Jiao Tong University chihao@sjtu.edu.cn |
| Pseudocode | Yes | Algorithm 1: Online Stochastic Mirror Descent with Exploration |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets, therefore it does not mention publicly available datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets, thus it does not specify validation splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware, therefore no hardware specifications are mentioned. |
| Software Dependencies | No | The paper describes algorithms (OSMD, EXP3) but does not provide any specific software names or version numbers used for implementation or experiments. |
| Experiment Setup | No | The paper is theoretical and presents an algorithm (Algorithm 1) but does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings. |