Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models

Authors: Runzhe Wan, Lin Ge, Rui Song

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "The proposed method is further supported by extensive experiments. ... The Bayes regret for Gaussian bandits clearly demonstrates the benefits of information sharing with our algorithm." ... "6 Experiments; 6.1 Synthetic Experiments; 6.2 MovieLens Experiments" |
| Researcher Affiliation | Academia | "Runzhe Wan, Lin Ge, Rui Song. Department of Statistics, North Carolina State University. {rwan, lge, rsong}@ncsu.edu" |
| Pseudocode | Yes | "Algorithm 1: MTTS: Multi-Task Thompson Sampling" ... "Algorithm 2: Computationally Efficient Variant of MTTS under the Sequential Setting" |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | "We evaluate the empirical performance on the MovieLens 1M dataset [24]" |
| Dataset Splits | No | The paper does not specify training, validation, and test splits (explicit percentages, sample counts, or citations to predefined splits) for the MovieLens dataset, nor does it mention a validation set. For the synthetic experiments, data is generated rather than split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions implementing methods but does not list specific software with version numbers (e.g., Python, PyTorch, TensorFlow, or library versions) needed for reproducibility. |
| Experiment Setup | Yes | "In the main text, we present results with N = 200, T = 200, K = 8 and d = 15. We set φ(x_i, a) = (1_a^T, φ_{i,a}^T)^T, where 1_a is a length-K indicator vector taking value 1 at the a-th entry, and φ_{i,a} is sampled from N(0, I_{d−K}). The coefficient vector θ is sampled from N(0, d^{−1} I_d). For Gaussian bandits, we set σ = 1 and Σ = σ_1^2 I_K, for different values of σ_1. For Bernoulli bandits, we vary the precision ψ." |
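The quoted Experiment Setup can be sketched as a short data-generating routine. This is a minimal illustration of our reading of the quoted notation, not the authors' code: we assume the task-level arm means are drawn around φ(x_i, a)^T θ with covariance Σ = σ_1^2 I_K, and all variable names (e.g. `make_features`, `delta`) are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K, d = 200, 200, 8, 15   # tasks, horizon, arms, feature dimension
sigma, sigma1 = 1.0, 0.5       # sigma is fixed at 1; sigma1 is varied in the paper

# theta ~ N(0, d^-1 I_d): the shared coefficient vector
theta = rng.normal(0.0, np.sqrt(1.0 / d), size=d)

def make_features(rng):
    """phi(x_i, a) = (1_a^T, phi_{i,a}^T)^T for one task, stacked as a (K, d) matrix.

    Row a concatenates the length-K indicator 1_a with phi_{i,a} ~ N(0, I_{d-K}).
    """
    phi_ia = rng.normal(size=(K, d - K))
    return np.hstack([np.eye(K), phi_ia])

# One task's mean rewards: mu_i ~ N(Phi_i theta, Sigma), Sigma = sigma1^2 I_K
# (our assumed hierarchical structure for the Gaussian-bandit case).
Phi = make_features(rng)                    # (K, d) feature matrix for task i
delta = rng.normal(0.0, sigma1, size=K)     # task-level deviation, N(0, sigma1^2 I_K)
mean_rewards = Phi @ theta + delta          # (K,) expected reward per arm
reward = rng.normal(mean_rewards[0], sigma)  # one noisy pull of arm 0
```

With K = 8 and d = 15, each φ_{i,a} is 7-dimensional, so the concatenated feature vector has the stated dimension d = 15.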