Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Best Arm Identification in Multi-Agent Multi-Armed Bandits
Authors: Filippo Vannella, Alexandre Proutiere, Jaeseong Jeong
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of MF-TaS numerically using both synthetic and real-world experiments (e.g., to solve the antenna tilt optimization problem in radio communication networks). |
| Researcher Affiliation | Collaboration | ¹KTH Royal Institute of Technology, Stockholm, Sweden; ²Ericsson, Stockholm, Sweden. Correspondence to: Filippo Vannella <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 FCR, Algorithm 2 MF-TaS, Algorithm 3 VE, Algorithm 4 BUILD A0 |
| Open Source Code | Yes | Additional experiments are reported in App. J, and the code is available at this link. |
| Open Datasets | No | We run our experiments in a proprietary mobile network simulator in an urban environment. The local expected rewards are selected at random as $\theta_i(a_i, a_{i+i}) \sim U(0, M)$, for all $i \in [N]$ and for some $M > 0$. |
| Dataset Splits | No | The paper does not specify dataset splits like training, validation, or test sets; it mentions synthetic data generation and a proprietary simulator. |
| Hardware Specification | Yes | The experiments run on a MacBook Pro 2.6 GHz 6-Core Intel Core i7 processor. We use this setup in all of our experiments. |
| Software Dependencies | No | We implement the solver for the lower bound optimization problems using CVXPY (Diamond & Boyd, 2016), with a MOSEK solver. |
| Experiment Setup | Yes | The exploration threshold is selected as $\beta(\delta, t) = \log((\log(t) + 1)/\delta)$. The elimination order for both VE and FCR is chosen as $O = \{N, N-1, \ldots, 1\}$. |
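
The experiment setup quoted in the last row can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: it assumes the threshold reads as β(δ, t) = log((log(t) + 1)/δ), since the quoted formula has unbalanced parentheses, and the function names `exploration_threshold` and `elimination_order` are hypothetical.

```python
import math


def exploration_threshold(delta: float, t: int) -> float:
    """Threshold beta(delta, t) = log((log(t) + 1) / delta).

    Assumption: this parenthesization is one reading of the quoted formula.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if t < 1:
        raise ValueError("t must be at least 1")
    return math.log((math.log(t) + 1.0) / delta)


def elimination_order(n_agents: int) -> list[int]:
    """Elimination order O = {N, N-1, ..., 1}, from the last agent to the first."""
    return list(range(n_agents, 0, -1))
```

Under this reading, the threshold at round t = 1 reduces to log(1/δ) and grows slowly (doubly logarithmically) with t, while `elimination_order(4)` returns `[4, 3, 2, 1]`.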