Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits
Authors: Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Algorithm 1 in the context unaware setting and Algorithm 3 in the partially context aware setting, and verify their insights through synthetic simulations. |
| Researcher Affiliation | Academia | ¹Chandra Family Department of Electrical and Computer Engineering, University of Texas, Austin, TX, USA; ²Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, IL, USA. |
| Pseudocode | Yes | Algorithm 1 (at agent i); Algorithm 2 Arm Recommendation; Algorithm 3 (at agent i); Algorithm 4 Dividing M most recent unique arm recommendations |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to a code repository. |
| Open Datasets | No | The paper uses synthetic simulations where 'Arm means are generated uniformly at random from [0, 1) in Figure 1 and [2, 4) in Figure 2', but does not provide access information (link, DOI, repository, or formal citation) for a public dataset. |
| Dataset Splits | No | The paper conducts synthetic simulations and does not describe any training, validation, or test dataset splits (e.g., percentages or counts) or refer to standard predefined splits. |
| Hardware Specification | No | The paper describes simulation parameters ('K arm means... generated uniformly at random', 'set β = 3', 'UCB parameter α is set to 15') but does not specify any hardware components (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software components with version numbers (e.g., programming languages, libraries, or frameworks with their versions) that would be needed to replicate the experiment. |
| Experiment Setup | Yes | We assume an equal number (N/M) of agents learning each bandit, set β = 3 and the size of the sticky set S = K/N in these simulations. The UCB parameter α is set to 15 in Figure 1 and 30 in Figure 2. |
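
Since no source code is released, the reported setup can only be approximated. The following is a minimal sketch of how the quoted parameters might be assembled, not the authors' implementation. The names `make_setup` and `ucb_index` are hypothetical, and several choices are assumptions: that each of the M bandits gets its own vector of K arm means, that K/N and N/M divide evenly, that agents are assigned to bandits in contiguous blocks, and that the UCB index takes the common form of empirical mean plus sqrt(α ln t / pulls).

```python
import math
import random


def make_setup(K, N, M, low=0.0, high=1.0, beta=3, alpha=15, seed=0):
    """Assemble a hypothetical simulation config from the reported parameters."""
    if N % M != 0 or K % N != 0:
        # Assumption: the reported ratios N/M and K/N are integers.
        raise ValueError("expected N divisible by M and K divisible by N")
    rng = random.Random(seed)
    # Assumption: one vector of K arm means per bandit, each mean drawn
    # uniformly at random from [low, high).
    arm_means = [
        [low + (high - low) * rng.random() for _ in range(K)]
        for _ in range(M)
    ]
    agents_per_bandit = N // M  # "equal number (N/M) of agents learning each bandit"
    # Assumption: agents are assigned to bandits in contiguous blocks.
    agent_bandit = [i // agents_per_bandit for i in range(N)]
    return {
        "arm_means": arm_means,
        "agent_bandit": agent_bandit,
        "agents_per_bandit": agents_per_bandit,
        "sticky_size": K // N,  # "size of the sticky set S = K/N"
        "beta": beta,           # "set β = 3"
        "alpha": alpha,         # UCB parameter α
    }


def ucb_index(mean_estimate, pulls, t, alpha):
    """Common UCB index form (an assumption, not confirmed by the source)."""
    if pulls == 0:
        return math.inf  # force at least one pull of every arm
    return mean_estimate + math.sqrt(alpha * math.log(t) / pulls)


# Figure 1 style parameters: means in [0, 1), alpha = 15.
setup = make_setup(K=20, N=10, M=2, low=0.0, high=1.0, alpha=15)
```

Under the same assumptions, the Figure 2 configuration would use `low=2.0, high=4.0, alpha=30`. The values of K, N, and M in the usage line are placeholders, since the quoted text does not state them.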