Federated Multi-Armed Bandits

Authors: Chengshuai Shi, Cong Shen9603-9611

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments using both synthetic and real-world datasets corroborate the theoretical analysis and provide interesting insight into the proposed algorithms. Numerical simulations on synthetic and real-world datasets demonstrate the effectiveness and efficiency of the proposed algorithms and offer some interesting insights.
Researcher Affiliation Academia Department of Electrical and Computer Engineering, University of Virginia
Pseudocode Yes Algorithm 1 Fed2-UCB: client m; Algorithm 2 Fed2-UCB: central server
Open Source Code No The paper does not provide an explicit statement or a link to open-source code for the described methodology.
Open Datasets Yes The Movie Lens dataset (Cantador, Brusilovsky, and Kuflik 2011) is used for the real-world evaluation as an implementation of recommender system, which has been widely adopted in MAB studies (Oh and Iyengar 2019; Mahadik et al. 2020).
Dataset Splits No The paper describes the datasets used (synthetic and Movie Lens) but does not provide specific training, validation, and test dataset splits or a methodology for generating them.
Hardware Specification No The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies with version numbers needed to replicate the experiments.
Experiment Setup Yes The communication cost is set to be C = 1. A bandit game with K = 10 arms is used to mimic 10 candidate channels, and Gaussian distributions with σ = 0.5 are used to generate local observations of the channel availability. The means of global arms are in the interval [0.7, 0.8] with = 0.02. ... M = 5 clients are involved ... with f(p) = 10 log(T) ... For the approximate model, the same set of global arms is used while the local models are generated by Gaussian distributions with σc = 0.02. Fig. 5 shows that Fed2-UCB with f(p) = 100 and g(p) = 2p ... Under a reduced time horizon T = 104, Fig. 6 provides a finer look at the shape of regret curves of Fed2-UCB ... With a short update period f(p) = 10 ... f(p) = 50 ... The suboptimality gap of the pre-processed data is 0.0053. Using Fed2-UCB and f(p) = 200 with g(p) = 2p...