Best Arm Identification in Multi-Agent Multi-Armed Bandits
Authors: Filippo Vannella, Alexandre Proutiere, Jaeseong Jeong
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the performance of MF-TaS numerically using both synthetic and real-world experiments (e.g., to solve the antenna tilt optimization problem in radio communication networks). |
| Researcher Affiliation | Collaboration | 1KTH Royal Institute of Technology, Stockholm, Sweden 2Ericsson, Stockholm, Sweden. Correspondence to: Filippo Vannella <vannella@kth.se>. |
| Pseudocode | Yes | Algorithm 1 FCR, Algorithm 2 MF-TaS, Algorithm 3 VE, Algorithm 4 BUILD A0 |
| Open Source Code | Yes | Additional experiments are reported in App. J, and the code is available at this link. |
| Open Datasets | No | We run our experiments in a proprietary mobile network simulator in an urban environment. The local expected rewards are selected at random as θ_i(a_i, a_{i+i}) ∼ U(0, M), for all i ∈ [N] and for some M > 0. |
| Dataset Splits | No | The paper does not specify dataset splits like training, validation, or test sets; it mentions synthetic data generation and a proprietary simulator. |
| Hardware Specification | Yes | The experiments run on a MacBook Pro with a 2.6 GHz 6-Core Intel Core i7 processor. We use this setup in all of our experiments. |
| Software Dependencies | No | We implement the solver for the lower bound optimization problems using CVXPY (Diamond & Boyd, 2016), with a MOSEK solver. |
| Experiment Setup | Yes | The exploration threshold is selected as β(δ, t) = log((log(t) + 1)/δ). The elimination order for both VE and FCR is chosen as O = {N, N − 1, ..., 1}. |
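
The experiment setup and synthetic reward generation quoted in the table can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code. It assumes the threshold is read as β(δ, t) = log((log(t) + 1)/δ), since the quoted formula has unbalanced parentheses. It also assumes that each local reward depends on a pair of local actions and is stored as a per-agent table. The function names `exploration_threshold`, `elimination_order`, and `sample_local_rewards` are hypothetical.

```python
import math
import random


def exploration_threshold(delta: float, t: int) -> float:
    """Assumed reading of the quoted threshold: log((log(t) + 1) / delta)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if t < 1:
        raise ValueError("t must be a positive round index")
    return math.log((math.log(t) + 1.0) / delta)


def elimination_order(n_agents: int) -> list[int]:
    """Elimination order O = {N, N - 1, ..., 1}, with agents indexed from 1."""
    return list(range(n_agents, 0, -1))


def sample_local_rewards(n_agents: int, n_actions: int, max_reward: float,
                         seed: int = 0) -> list[list[list[float]]]:
    """Draw each local expected reward independently from U(0, M).

    Assumption: every agent's local reward is indexed by a pair of local
    actions, giving one n_actions x n_actions table per agent.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(0.0, max_reward) for _ in range(n_actions)]
         for _ in range(n_actions)]
        for _ in range(n_agents)
    ]
```

Under this reading the threshold grows doubly logarithmically in t, so later rounds raise the stopping bar only slowly, while a smaller δ raises it for every t.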