Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits
Authors: Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Algorithm 1 in the context unaware setting and Algorithm 3 in the partially context aware setting, and verify their insights through synthetic simulations. |
| Researcher Affiliation | Academia | ¹Chandra Family Department of Electrical and Computer Engineering, University of Texas, Austin, TX, USA; ²Department of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign, IL, USA. |
| Pseudocode | Yes | Algorithm 1 (at agent i); Algorithm 2: Arm Recommendation; Algorithm 3 (at agent i); Algorithm 4: Dividing M most recent unique arm recommendations |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to a code repository. |
| Open Datasets | No | The paper uses synthetic simulations where 'Arm means are generated uniformly at random from [0, 1) in Figure 1 and [2, 4) in Figure 2', but does not provide access information (link, DOI, repository, or formal citation) for a public dataset. |
| Dataset Splits | No | The paper conducts synthetic simulations and does not describe any training, validation, or test dataset splits (e.g., percentages or counts) or refer to standard predefined splits. |
| Hardware Specification | No | The paper describes simulation parameters ('K arm means... generated uniformly at random', 'set β = 3', 'UCB parameter α is set to 15') but does not specify any hardware components (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software components with version numbers (e.g., programming languages, libraries, or frameworks with their versions) that would be needed to replicate the experiment. |
| Experiment Setup | Yes | We assume an equal number (N/M) of agents learning each bandit, set β = 3 and the size of the sticky set S = K/N in these simulations. The UCB parameter α is set to 15 in Figure 1 and 30 in Figure 2. |
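
Since no source code is released, the quoted setup can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the integer division used for S = K/N, the values of K and N in the usage note, and the generic UCB index form (empirical mean plus sqrt(α ln t / pulls)) are all hypothetical; only the sampling ranges and the α values come from the table above.

```python
import math
import random


def generate_arm_means(num_arms, low, high, seed=0):
    """Draw arm means uniformly at random from [low, high)."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [low + (high - low) * rng.random() for _ in range(num_arms)]


def sticky_set_size(num_arms, num_agents):
    """Sticky set size S = K/N (integer division is an assumption)."""
    return num_arms // num_agents


def ucb_index(empirical_mean, pulls, t, alpha):
    """Generic UCB index; the exact form used in the paper is an assumption."""
    if pulls == 0:
        return math.inf  # unexplored arms are tried first
    return empirical_mean + math.sqrt(alpha * math.log(t) / pulls)


# Ranges and alpha values quoted in the table for each figure.
FIGURE_SETTINGS = {
    "Figure 1": {"low": 0.0, "high": 1.0, "alpha": 15},
    "Figure 2": {"low": 2.0, "high": 4.0, "alpha": 30},
}
```

For example, with hypothetical values K = 100 and N = 20, `generate_arm_means(100, 0.0, 1.0)` yields the Figure 1 style arm means and `sticky_set_size(100, 20)` gives a sticky set of 5 arms. Recording the seed alongside these parameters would address part of what the "Open Source Code" and "Software Dependencies" rows flag as missing.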