Multi-Agent Learning with Heterogeneous Linear Contextual Bandits
Authors: Anh Do, Thanh Nguyen-Tang, Raman Arora
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we validate our theoretical results with numerical simulations on synthetic data. When the level of dissimilarity is small, H-LINUCB outperforms independent learning. When the level of dissimilarity is high, our simulation shows that blindly using shared data can lead to linear regret, emphasizing the importance of the criterion we propose for when to stop the collaboration. |
| Researcher Affiliation | Academia | Anh Do Johns Hopkins University ado8@jhu.edu Thanh Nguyen-Tang Johns Hopkins University nguyent@cs.jhu.edu Raman Arora Johns Hopkins University arora@cs.jhu.edu |
| Pseudocode | Yes | Algorithm 1 H-LINUCB |
| Open Source Code | Yes | Our code is available here: https://github.com/anhddo/hlinUCB. |
| Open Datasets | Yes | Simulation setup. We generate the ε-MALCB problem for M = 60, d = 30, T = 10000 via the following procedure. We first choose a value of ε in each of the three dissimilarity regimes. Then we create the linear parameters {θ_m}_{m=1}^{M} as follows. Let u, {v_m}_{m=1}^{M} be random vectors with unit norm. We set θ_m = c·u + (ε/2)·v_m, where c is a constant in the range [0, 1 − ε]. This guarantees ‖θ_m‖ ≤ 1 and ‖θ_i − θ_j‖ ≤ ε for any two agents i, j. At each round, for each agent, we create a new decision set of size 50; each action is random and normalized to unit norm. The random noise is sampled from the standard normal distribution, η ∼ N(0, 1). |
| Dataset Splits | No | The paper uses synthetic data generated according to a described procedure but does not specify explicit train/validation/test splits. The experiments are run for T rounds, and performance is evaluated over these rounds, typical for bandit problems, rather than using static dataset splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. |
| Software Dependencies | No | The paper mentions a GitHub repository for the code but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | Simulation setup. We generate the ε-MALCB problem for M = 60, d = 30, T = 10000 via the following procedure. We first choose a value of ε in each of the three dissimilarity regimes. Then we create the linear parameters {θ_m}_{m=1}^{M} as follows. Let u, {v_m}_{m=1}^{M} be random vectors with unit norm. We set θ_m = c·u + (ε/2)·v_m, where c is a constant in the range [0, 1 − ε]. This guarantees ‖θ_m‖ ≤ 1 and ‖θ_i − θ_j‖ ≤ ε for any two agents i, j. At each round, for each agent, we create a new decision set of size 50; each action is random and normalized to unit norm. The random noise is sampled from the standard normal distribution, η ∼ N(0, 1). We run each experiment 10 times, then report the group regret averaged over the runs and the confidence intervals in Figure 1. |
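The quoted simulation setup can be sketched in a few lines of NumPy. This is a minimal illustration of the described generation procedure, not the authors' released code; the function names (`make_thetas`, `make_decision_set`) and the choice of drawing c uniformly from [0, 1 − ε] are assumptions for the sketch.

```python
import numpy as np

def make_thetas(M=60, d=30, eps=0.1, rng=None):
    """Create M linear parameters theta_m = c*u + (eps/2)*v_m, which
    guarantees ||theta_m|| <= 1 and ||theta_i - theta_j|| <= eps."""
    rng = np.random.default_rng(rng)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                    # shared direction, unit norm
    c = rng.uniform(0.0, 1.0 - eps)           # constant in [0, 1 - eps] (assumed uniform)
    thetas = []
    for _ in range(M):
        v = rng.standard_normal(d)
        v /= np.linalg.norm(v)                # per-agent perturbation, unit norm
        thetas.append(c * u + 0.5 * eps * v)
    return np.array(thetas)                   # shape (M, d)

def make_decision_set(d=30, K=50, rng=None):
    """Fresh decision set of K random unit-norm actions (drawn per agent, per round)."""
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((K, d))
    return A / np.linalg.norm(A, axis=1, keepdims=True)
```

The dissimilarity bound follows from the triangle inequality: ‖θ_i − θ_j‖ = (ε/2)‖v_i − v_j‖ ≤ ε, and ‖θ_m‖ ≤ c + ε/2 ≤ 1 − ε/2 ≤ 1. Rewards would then be generated per round as r = x·θ_m + η with η ∼ N(0, 1).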