Understanding Bandits with Graph Feedback

Authors: Houshuang Chen, zengfeng Huang, Shuai Li, Chihao Zhang

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We propose the notions of the fractional weak domination number δ and the k-packing independence number capturing upper bound and lower bound for the regret respectively. We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual the fractional vertex packing set respectively. Based on this connection, we utilize the strong duality theorem to prove a general regret upper bound O (δ log |V |) 1 3 T 2 3 and a lower bound Ω (δ /α) 1 3 T 2 3. Our main algorithmic result is: Theorem 1. There exists an algorithm such that for any weakly observable graph, any time horizon T n3 log(n)/δ 2(G), its regret satisfies R(G, T) = O (δ (G) log n) 1 3 T 2 3.
Researcher Affiliation Academia Houshuang Chen Shanghai Jiao Tong University chenhoushuang@sjtu.edu.cn Zengfeng Huang Fudan University huangzf@fudan.edu.cn Shuai Li Shanghai Jiao Tong University shuaili8@sjtu.edu.cn Chihao Zhang Shanghai Jiao Tong University chihao@sjtu.edu.cn
Pseudocode Yes Algorithm 1: Online Stochastic Mirror Descent with Exploration
Open Source Code No The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets No The paper is theoretical and does not conduct experiments on specific datasets, therefore it does not mention publicly available datasets for training.
Dataset Splits No The paper is theoretical and does not involve empirical experiments with datasets, thus it does not specify validation splits.
Hardware Specification No The paper is theoretical and does not describe any experiments that would require specific hardware, therefore no hardware specifications are mentioned.
Software Dependencies No The paper describes algorithms (OSMD, EXP3) but does not provide any specific software names or version numbers used for implementation or experiments.
Experiment Setup No The paper is theoretical and presents an algorithm (Algorithm 1) but does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings.