Best Arm Identification in Multi-Agent Multi-Armed Bandits

Authors: Filippo Vannella, Alexandre Proutiere, Jaeseong Jeong

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the performance of MF-TaS numerically using both synthetic and real-world experiments (e.g., to solve the antenna tilt optimization problem in radio communication networks).
Researcher Affiliation | Collaboration | 1 KTH Royal Institute of Technology, Stockholm, Sweden; 2 Ericsson, Stockholm, Sweden. Correspondence to: Filippo Vannella <vannella@kth.se>.
Pseudocode | Yes | Algorithm 1 FCR, Algorithm 2 MF-TaS, Algorithm 3 VE, Algorithm 4 BUILD A0
Open Source Code | Yes | Additional experiments are reported in App. J, and the code is available at this link.
Open Datasets | No | We run our experiments in a proprietary mobile network simulator in an urban environment. The local expected rewards are selected at random as $\theta_i(a_i, a_{i+1}) \sim \mathcal{U}(0, M)$, for all $i \in [N]$ and for some $M > 0$.
Dataset Splits | No | The paper does not specify dataset splits like training, validation, or test sets; it mentions synthetic data generation and a proprietary simulator.
Hardware Specification | Yes | The experiments run on a MacBook Pro 2.6 GHz 6-Core Intel Core i7 processor. We use this setup in all of our experiments.
Software Dependencies | No | We implement the solver for the lower bound optimization problems using CVXPY (Diamond & Boyd, 2016), with a MOSEK solver.
Experiment Setup | Yes | The exploration threshold is selected as $\beta(\delta, t) = \log((\log(t) + 1)/\delta)$. The elimination order for both VE and FCR is chosen as $O = \{N, N-1, \dots, 1\}$.
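
The quantities quoted in the Open Datasets and Experiment Setup rows can be sketched in a few lines of Python. This is an illustrative reading of the quoted text, not the authors' code. The function names, the seed, and the assumption that every agent has the same number of actions are hypothetical. The pairing of each agent i with its neighbor i+1 is inferred from the garbled index in the quoted reward formula, and the threshold assumes the parenthesization log((log(t) + 1)/δ).

```python
import math
import random


def exploration_threshold(delta: float, t: int) -> float:
    """Threshold beta(delta, t) = log((log(t) + 1) / delta) from the setup row."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if t < 1:
        raise ValueError("t must be a positive round index")
    return math.log((math.log(t) + 1.0) / delta)


def sample_local_rewards(n_agents: int, n_actions: int, max_reward: float, seed: int = 0):
    """Draw theta_i(a_i, a_{i+1}) ~ U(0, M) for each agent i and each action pair.

    Assumption: every agent has n_actions actions; the result is indexed as
    rewards[i][a_i][a_next].
    """
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    return [
        [[rng.uniform(0.0, max_reward) for _ in range(n_actions)]
         for _ in range(n_actions)]
        for _ in range(n_agents)
    ]


def elimination_order(n_agents: int) -> list:
    """Elimination order O = (N, N-1, ..., 1) quoted for both VE and FCR."""
    return list(range(n_agents, 0, -1))
```

For example, exploration_threshold(0.1, 1) evaluates to log(10), since log(1) = 0, and elimination_order(4) returns [4, 3, 2, 1].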