Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization

Authors: Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | BEACON achieves impressive empirical results. It not only significantly outperforms existing decentralized algorithms but also performs comparably to the centralized benchmark, corroborating the theoretical analysis. Remarkably, with a linear reward function BEACON generally achieves a 6× improvement over the state-of-the-art METC (Boursier et al., 2020).
Researcher Affiliation | Academia | Chengshuai Shi, University of Virginia (cs7ync@virginia.edu); Wei Xiong, The Hong Kong University of Science and Technology (wxiongae@connect.ust.hk); Cong Shen, University of Virginia (cong@virginia.edu); Jing Yang, The Pennsylvania State University (yangjing@psu.edu)
Pseudocode | Yes | Algorithm 1 "BEACON: Leader" is provided.
Open Source Code | No | The paper does not provide any statement or link indicating that source code for the described methodology is publicly available.
Open Datasets | No | All results are averaged over 100 experiments, with utilities following mutually independent Bernoulli distributions. To validate whether the significant gain of BEACON over METC is representative, Fig. 2(b) plots a histogram of regrets over 100 randomly generated instances, still with M = 5, K = 5, T = 10^6; expected arm utilities are uniformly sampled from [0, 1] in each instance. The paper uses randomly generated instances rather than a pre-existing public dataset and does not provide access information for the generated data.
Dataset Splits | No | Experiments are run on randomly generated instances and results are averaged over 100 runs, but the paper does not explicitly provide train/validation/test splits, percentages, or sample counts for data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, cloud instances) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies (e.g., libraries, frameworks, or solvers) with version numbers that would be needed to replicate the experiments.
Experiment Setup | No | The paper gives game-instance parameters such as M = 5, K = 5 or M = 6, K = 8 and states that utilities follow Bernoulli distributions, but it does not provide further experimental setup details such as hyperparameters (e.g., learning rates, batch sizes, epochs) or other system-level configurations for the algorithms themselves.
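As a reproduction aid, the experimental setup the table describes (random heterogeneous instances with M players, K arms, expected Bernoulli utilities drawn uniformly from [0, 1], and a linear reward given by a one-to-one player-arm matching) can be sketched as follows. This is a minimal, hedged sketch under those stated assumptions only; the function names are illustrative and do not come from the paper, and the brute-force matching is just the centralized oracle a regret curve would be measured against, not the BEACON algorithm itself.

```python
import itertools
import numpy as np

def sample_instance(M=5, K=5, rng=None):
    """Draw an M x K matrix of expected arm utilities, one row per player,
    sampled uniformly from [0, 1] as in the paper's random instances."""
    if rng is None:
        rng = np.random.default_rng()
    return rng.uniform(0.0, 1.0, size=(M, K))

def optimal_assignment_value(mu):
    """Best total expected utility over one-to-one player-arm matchings,
    assuming a linear (sum) reward function. Brute force over arm
    permutations is fine at the paper's scale (M = 5, K = 5)."""
    M, K = mu.shape
    best = 0.0
    for perm in itertools.permutations(range(K), M):
        best = max(best, sum(mu[m, perm[m]] for m in range(M)))
    return best

# One random instance and its centralized-oracle value; expected regret at
# horizon T = 10^6 would be measured against this per-round optimum.
mu = sample_instance(M=5, K=5, rng=np.random.default_rng(0))
print(optimal_assignment_value(mu))
```

Averaging such instances 100 times, as the table notes, reproduces the sampling protocol of Fig. 2(b), though the Bernoulli reward draws and the learning algorithm itself are not shown here.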