One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Authors: Udari Madhushani, Abhimanyu Dubey, Naomi Leonard, Alex Pentland

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our proposed algorithms are straightforward to implement and obtain competitive empirical performance.
Researcher Affiliation | Academia | Udari Madhushani, Princeton University, udarim@princeton.edu; Abhimanyu Dubey, Massachusetts Institute of Technology, dubeya@mit.edu; Naomi Ehrich Leonard, Princeton University, naomi@princeton.edu; Alex Pentland, Massachusetts Institute of Technology, pentland@mit.edu
Pseudocode | Yes | Algorithm 1: RCL-RC: Cooperative Hybrid Arm Elimination
Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper describes generating data for experiments: "We consider the 10-armed bandit with rewards drawn from Gaussian distributions...", "We draw delays from a bounded distribution...", and "We draw corruptions uniformly from the range...", but does not refer to using or providing access to a publicly available dataset.
Dataset Splits | No | The paper describes a multi-armed bandit simulation setup with agents and arms over time, not a dataset with explicit training, validation, and test splits.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper mentions 'networkx' but does not specify a version number, and it gives no version numbers for any other software dependencies.
Experiment Setup | Yes | We consider the 10-armed bandit with rewards drawn from Gaussian distributions with σk = 1 for each arm, such that µ1 = 1 and µk = 0.5 for k ≠ 1, and the number of agents N = 50, where we repeat each experiment 100 times with G selected randomly from different families of random graphs. We set = 1.1 and γ = max{3, d?(G)/2}. We draw delays from a bounded distribution with E[ ] = 10 and max = 50. We draw corruptions uniformly from the range [0, ] for each message, where is increased from 10^-3 to 10^-2. We use γ = 2.
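The experiment setup quoted above can be sketched as a small simulation environment. This is a minimal illustration under stated assumptions, not the authors' code: the paper only says that delays are bounded with mean 10 and maximum 50, so the binomial delay law here is an assumption; treating corruption as an additive offset is an assumption; and the Erdos-Renyi graph with edge probability 0.1 stands in for the unspecified "families of random graphs". All function and constant names (pull_arm, sample_delay, corrupt_message, random_graph, eps) are hypothetical. Only the standard library is used, although the paper itself mentions networkx.

```python
import random

NUM_ARMS = 10
NUM_AGENTS = 50
# From the quoted setup: mu_1 = 1 and mu_k = 0.5 for every other arm.
ARM_MEANS = [1.0] + [0.5] * (NUM_ARMS - 1)
ARM_STD = 1.0      # sigma_k = 1 for each arm
MEAN_DELAY = 10    # expected message delay
MAX_DELAY = 50     # hard upper bound on the delay


def pull_arm(arm, rng):
    """Draw one Gaussian reward for a 0-based arm index."""
    return rng.gauss(ARM_MEANS[arm], ARM_STD)


def sample_delay(rng):
    """Integer delay in [0, MAX_DELAY] with mean MEAN_DELAY.

    Assumption: a Binomial(MAX_DELAY, MEAN_DELAY / MAX_DELAY) draw.
    The paper does not name the delay distribution.
    """
    p = MEAN_DELAY / MAX_DELAY
    return sum(rng.random() < p for _ in range(MAX_DELAY))


def corrupt_message(value, eps, rng):
    """Add a corruption drawn uniformly from [0, eps].

    Assumption: the corruption is additive on the transmitted value.
    """
    return value + rng.uniform(0.0, eps)


def random_graph(num_nodes, edge_prob, rng):
    """Undirected Erdos-Renyi graph as a dict of neighbour sets.

    Assumption: one possible random-graph family; edge_prob is hypothetical.
    """
    neighbours = {i: set() for i in range(num_nodes)}
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if rng.random() < edge_prob:
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours


if __name__ == "__main__":
    rng = random.Random(0)
    graph = random_graph(NUM_AGENTS, 0.1, rng)
    reward = pull_arm(0, rng)
    # One message: a reward from arm 0, corrupted and delayed.
    message = corrupt_message(reward, 1e-3, rng)
    print(len(graph), sample_delay(rng), message)
```

Sweeping eps from 1e-3 to 1e-2 and repeating over 100 freshly drawn graphs would mirror the corruption experiment described in the table, again only under the assumptions listed above.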