Improved Bayes Regret Bounds for Multi-Task Hierarchical Bayesian Bandit Algorithms
Authors: Jiechao Guan, Hui Xiong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct experiments in the linear bandit setting to verify our theoretical results. Specifically, we show the influence of hyper-parameters (e.g. m, n, L) on the multi-task Bayes regret of HierTS and HierBayesUCB, to validate the consistency between their regret bounds and practical performance. Besides, we compare the performance of our algorithms against other baselines, to show the effectiveness of hierarchical Bayesian bandit algorithms in the multi-task bandit setting. (...) Experimental Results. From Figure 1, we can observe that: (1) In plot (a), the multi-task regret becomes larger with the increase of m and n, which is consistent with our regret upper bound in Theorem 5.1. |
| Researcher Affiliation | Academia | Jiechao Guan¹, Hui Xiong¹·², ¹AI Thrust, The Hong Kong University of Science and Technology (Guangzhou), China; ²Department of Computer Science and Engineering, HKUST, China. {jiechaoguan, xionghui}@hkust-gz.edu.cn |
| Pseudocode | Yes | Algorithm 1 Hierarchical Bayesian Algorithms for Multi-Task Linear Bandit Setting (...) Algorithm 2 Hierarchical Bayesian Algorithms for Multi-Task Combinatorial Semi-Bandit Setting |
| Open Source Code | Yes | The source code for reproducing all experimental results of Hier TS and Hier Bayes UCB is provided in the supplementary material. |
| Open Datasets | No | The paper uses a synthetic problem setup, not a publicly available dataset with specific access information. 'The synthetic problem is defined as follows. In most experiments, we set the number of total tasks as m = 10, the dimension of action space as d = 4, the number of concurrent tasks as L = 5, the number of rounds as n = 200m/L. We focus on the finite action space with |A| = 10, and each action is sampled uniformly from [-0.5, 0.5]^d.' |
| Dataset Splits | No | The paper describes a synthetic experimental setting with parameters for simulations (e.g., 'number of rounds as n = 200m/L'), but it does not specify explicit training, validation, or test dataset splits in terms of percentages or sample counts for an external dataset. |
| Hardware Specification | Yes | We run all bandit algorithms on a platform with 8 NVIDIA RTX 6000 GPUs and 2 AMD EPYC 7543 processors. Each GPU has 48 GB of memory, and each CPU has 64 cores. |
| Software Dependencies | Yes | The CUDA version is 12.1, the Python version is 3.7.16, the matplotlib version is 3.5.3, and the tensorflow version is 1.15. |
| Experiment Setup | Yes | In most experiments, we set the number of total tasks as m = 10, the dimension of action space as d = 4, the number of concurrent tasks as L = 5, the number of rounds as n = 200m/L. We focus on the finite action space with |A| = 10, and each action is sampled uniformly from [-0.5, 0.5]^d. In the hierarchical Bayesian model, we set the hyper-prior as a zero-mean isotropic Gaussian distribution N(µ_q, Σ_q) = N(0, Σ_q), where Σ_q = σ_q² I_d; and set the task covariance Σ_0 = σ_0² I_d. Unless otherwise stated, we set σ_q = 1, σ_0 = 0.1, σ² = 0.5 for each task in most experiments. |
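The experiment setup above can be sketched in code. This is a minimal illustrative reconstruction of the synthetic hierarchical Bayesian bandit environment using the paper's reported hyper-parameters, not the authors' released implementation; the variable names and the `reward` helper are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyper-parameters reported in the paper's experiment setup.
m, d, L = 10, 4, 5            # total tasks, action dimension, concurrent tasks
n = 200 * m // L              # number of rounds
num_actions = 10              # |A|
sigma_q, sigma_0, sigma2 = 1.0, 0.1, 0.5

# Hyper-prior: task-mean parameter mu_* ~ N(0, sigma_q^2 I_d).
mu_star = rng.normal(0.0, sigma_q, size=d)

# Per-task parameters: theta_s ~ N(mu_*, sigma_0^2 I_d) for each of the m tasks.
thetas = mu_star + rng.normal(0.0, sigma_0, size=(m, d))

# Finite action set: each action sampled uniformly from [-0.5, 0.5]^d.
actions = rng.uniform(-0.5, 0.5, size=(num_actions, d))

def reward(task: int, action_idx: int) -> float:
    """Noisy linear reward: <theta_task, a> + eps, eps ~ N(0, sigma^2)."""
    return actions[action_idx] @ thetas[task] + rng.normal(0.0, np.sqrt(sigma2))
```

A bandit algorithm such as HierTS would then interact with `reward` over `n` rounds per task, maintaining a posterior over `mu_star` shared across tasks.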