reproducibilityindex.ai

Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits

Authors: Nikolai Karpov, Qin Zhang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we study the collaborative learning model, which concerns the tradeoff between parallelism and communication overhead in multi-agent multi-armed bandits. For regret minimization in multi-armed bandits, we present the ﬁrst set of tradeoffs between the number of rounds of communication among the agents and the regret of the collaborative learning process. Our main result is a lower bound for MAB in the CL model (Theorem 2). We show that for any T-time K-agent collaborative algorithm AK, there is an input I such that if AK runs on I using at most R log(KT ) 2 log log log(KT ) rounds, then AK incurs an expected re-gret of Ω min{K, (KT) 1 R } 1 (I) . We also design an algorithm for batched MAB (Theorem 17). Again via Observation 1, we obtain an algorithm for MAB in the CL model. Our upper bound matches the lower bound up to logarithmic factors in regret.
Researcher Affiliation	Academia	Nikolai Karpov, Qin Zhang Indiana University Bloomington, IN 47405, USA nkarpov@iu.edu, qzhangcs@indiana.edu
Pseudocode	Yes	Algorithm 1: BATCHEDMAB(I, λ, T)
Open Source Code	No	The paper is theoretical and focuses on algorithm design and theoretical bounds. It does not provide any statement about releasing open-source code for the described methodology or a link to a code repository.
Open Datasets	No	The paper is theoretical and defines problem setups conceptually (e.g., 'Bernoulli arms') rather than using specific, publicly available datasets for empirical training. No dataset access information is provided.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments or data splits for training, validation, or testing.
Hardware Specification	No	The paper is theoretical and does not describe any specific hardware specifications used for running experiments.
Software Dependencies	No	The paper is theoretical and focuses on algorithm design and theoretical proofs. It does not list any specific software dependencies with version numbers.
Experiment Setup	No	The paper describes a theoretical algorithm (Algorithm 1) but does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings, as it does not report empirical experiments.