Multi-Player Bandits – a Musical Chairs Approach

Authors: Jonathan Rosenski, Ohad Shamir, Liran Szlak

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present several experiments which validate our theoretical findings. For our experiments, we implemented the DMC algorithm for the dynamic case and the MC algorithm for the static case. For comparison, we implemented the MEGA algorithm of (Avner & Mannor, 2014), which is the current state-of-the-art for our problem setting.
Researcher Affiliation | Academia | Jonathan Rosenski, JONATHAN.ROSENSKI@WEIZMANN.AC.IL, Weizmann Institute of Science, Rehovot 7610001, Israel; Ohad Shamir, OHAD.SHAMIR@WEIZMANN.AC.IL, Weizmann Institute of Science, Rehovot 7610001, Israel; Liran Szlak, LIRAN.SZLAK@WEIZMANN.AC.IL, Weizmann Institute of Science, Rehovot 7610001, Israel
Pseudocode | Yes | Algorithm 1 MC, Algorithm 2 Musical Chairs, Algorithm 3 Dynamic MC
Open Source Code | No | No explicit statement or link providing access to open-source code for the described methodology was found.
Open Datasets | No | No concrete access information (link, DOI, repository, or formal citation) for a publicly available or open dataset was provided. The paper describes generating data based on random distributions: 'The mean rewards of the arms are chosen uniformly at random in [0, 1]'.
Dataset Splits | No | No specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) was provided. The paper describes simulation scenarios without explicit train/validation/test splits.
Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, or detailed computer specifications) used for running the experiments were provided.
Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment were provided.
Experiment Setup | Yes | For the MC and DMC algorithm, we set T0 to be 3000 in all experiments. For the DMC parameter, T1, we use either the theoretically optimal value presented in this work or that value scaled by a small constant (see details below for the specific value in each experiment).
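
Since no code was released, the following is a rough Python sketch of how the quoted simulation setup might be reproduced. It is not the authors' implementation. It draws arm means uniformly at random in [0, 1] as quoted above, then simulates only the seat-finding idea suggested by the "Musical Chairs" name: each unsettled player samples uniformly among the best arms and stays on the first arm it plays without a collision. The function names (`draw_arm_means`, `musical_chairs`), the `max_rounds` cap, the player and arm counts, and the use of true means for ranking (instead of estimates from a learning phase of T0 = 3000 rounds) are all assumptions made for illustration.

```python
import random


def draw_arm_means(num_arms, rng):
    """Draw one mean reward per arm, uniformly at random in [0, 1]."""
    return [rng.random() for _ in range(num_arms)]


def musical_chairs(num_players, top_arms, rng, max_rounds=10_000):
    """Simulate the seat-finding phase (illustrative sketch only).

    Each round, every unsettled player samples an arm uniformly from
    top_arms, while settled players keep playing their own arm. A player
    settles once it is the only one on its chosen arm in that round.
    Returns (assignment, rounds_used), where assignment maps player -> arm.
    """
    if len(top_arms) < num_players:
        raise ValueError("need at least one arm per player")
    fixed = {}  # player index -> arm it has settled on
    for round_idx in range(1, max_rounds + 1):
        # Settled players repeat their arm; the rest sample a fresh one.
        choices = {
            p: fixed[p] if p in fixed else rng.choice(top_arms)
            for p in range(num_players)
        }
        # Count how many players landed on each arm this round.
        counts = {}
        for arm in choices.values():
            counts[arm] = counts.get(arm, 0) + 1
        # A collision-free play lets an unsettled player keep that arm.
        for p, arm in choices.items():
            if p not in fixed and counts[arm] == 1:
                fixed[p] = arm
        if len(fixed) == num_players:
            return fixed, round_idx
    raise RuntimeError("players did not settle within max_rounds")


# Hypothetical usage: 10 arms and 4 players (values chosen for illustration).
rng = random.Random(0)
means = draw_arm_means(10, rng)
num_players = 4
# Simplification: rank by the true means rather than learned estimates.
top_arms = sorted(range(len(means)), key=lambda a: means[a], reverse=True)[:num_players]
assignment, rounds = musical_chairs(num_players, top_arms, rng)
```

Seeding `random.Random` makes a run repeatable, which partly compensates for the missing hardware and software details noted in the table. A fuller reproduction would also need the learning phase and the T1 values for DMC, which this summary does not specify.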