Multi-Agent Learning with Heterogeneous Linear Contextual Bandits
Authors: Anh Do, Thanh Nguyen-Tang, Raman Arora
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we validate our theoretical results with numerical simulations on synthetic data. When the level of dissimilarity is small, H-LINUCB outperforms independent learning. When the level of dissimilarity is high, our simulation shows that blindly using shared data can lead to linear regret, emphasizing the importance of the criterion we propose for when to stop the collaboration. |
| Researcher Affiliation | Academia | Anh Do Johns Hopkins University ado8@jhu.edu Thanh Nguyen-Tang Johns Hopkins University nguyent@cs.jhu.edu Raman Arora Johns Hopkins University arora@cs.jhu.edu |
| Pseudocode | Yes | Algorithm 1 H-LINUCB |
| Open Source Code | Yes | Our code is available here: https://github.com/anhddo/hlinUCB. |
| Open Datasets | Yes | Simulation setup. We generate the ε-MALCB problem for M = 60, d = 30, T = 10000 via the following procedure. We first choose a value of ε in each of the three dissimilarity regimes. Then we create the linear parameters {θ_m}_{m=1}^{M} as follows. Let u, {v_m}_{m=1}^{M} be random vectors with unit norm. We set θ_m = c·u + (ε/2)·v_m, where c is a constant in the range [0, 1 − ε]. This guarantees ‖θ_m‖ ≤ 1 and ‖θ_i − θ_j‖ ≤ ε for any two agents i, j. At each round, for each agent, we create a new decision set of size 50; each action is random and normalized to unit norm. The random noise is sampled from the standard normal distribution, η ∼ N(0, 1). |
| Dataset Splits | No | The paper uses synthetic data generated according to a described procedure but does not specify explicit train/validation/test splits. The experiments are run for T rounds, and performance is evaluated over these rounds, typical for bandit problems, rather than using static dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper mentions a GitHub repository for the code but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | Simulation setup. We generate the ε-MALCB problem for M = 60, d = 30, T = 10000 via the following procedure. We first choose a value of ε in each of the three dissimilarity regimes. Then we create the linear parameters {θ_m}_{m=1}^{M} as follows. Let u, {v_m}_{m=1}^{M} be random vectors with unit norm. We set θ_m = c·u + (ε/2)·v_m, where c is a constant in the range [0, 1 − ε]. This guarantees ‖θ_m‖ ≤ 1 and ‖θ_i − θ_j‖ ≤ ε for any two agents i, j. At each round, for each agent, we create a new decision set of size 50; each action is random and normalized to unit norm. The random noise is sampled from the standard normal distribution, η ∼ N(0, 1). We run each experiment 10 times, then report the group regret averaged over the runs and the confidence intervals in Figure 1. |
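The quoted simulation setup can be sketched in a few lines of NumPy. This is a minimal illustration of the described generation procedure, not the authors' released code; the function names (`make_thetas`, `make_decision_set`) and the choice of drawing c uniformly from [0, 1 − ε] are assumptions for the sketch.

```python
import numpy as np

def make_thetas(M=60, d=30, eps=0.1, rng=None):
    """Create M linear parameters theta_m = c*u + (eps/2)*v_m, which
    guarantees ||theta_m|| <= 1 and ||theta_i - theta_j|| <= eps."""
    rng = np.random.default_rng(rng)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                    # shared direction, unit norm
    c = rng.uniform(0.0, 1.0 - eps)           # constant in [0, 1 - eps] (assumed uniform)
    thetas = []
    for _ in range(M):
        v = rng.standard_normal(d)
        v /= np.linalg.norm(v)                # per-agent perturbation, unit norm
        thetas.append(c * u + 0.5 * eps * v)
    return np.array(thetas)                   # shape (M, d)

def make_decision_set(d=30, K=50, rng=None):
    """Fresh decision set of K random unit-norm actions (drawn per agent, per round)."""
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((K, d))
    return A / np.linalg.norm(A, axis=1, keepdims=True)
```

The dissimilarity bound follows from the triangle inequality: ‖θ_i − θ_j‖ = (ε/2)‖v_i − v_j‖ ≤ ε, and ‖θ_m‖ ≤ c + ε/2 ≤ 1 − ε/2 ≤ 1. Rewards would then be generated per round as r = x·θ_m + η with η ∼ N(0, 1).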