One More Step Towards Reality: Cooperative Bandits with Imperfect Communication
Authors: Udari Madhushani, Abhimanyu Dubey, Naomi Leonard, Alex Pentland
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed algorithms are straightforward to implement and obtain competitive empirical performance. |
| Researcher Affiliation | Academia | Udari Madhushani (Princeton University, udarim@princeton.edu); Abhimanyu Dubey (Massachusetts Institute of Technology, dubeya@mit.edu); Naomi Ehrich Leonard (Princeton University, naomi@princeton.edu); Alex Pentland (Massachusetts Institute of Technology, pentland@mit.edu) |
| Pseudocode | Yes | Algorithm 1: RCL-RC: Cooperative Hybrid Arm Elimination |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes generating data for experiments: "We consider the 10-armed bandit with rewards drawn from Gaussian distributions..." and "We draw delays from a bounded distribution..." and "We draw corruptions uniformly from the range..." but does not refer to using or providing access to a publicly available dataset. |
| Dataset Splits | No | The paper describes a multi-armed bandit simulation setup with agents and arms over time, not a dataset with explicit training, validation, and test splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions 'networkx' but does not specify a version number. It does not provide specific version numbers for any other software dependencies. |
| Experiment Setup | Yes | We consider the 10-armed bandit with rewards drawn from Gaussian distributions with σₖ = 1 for each arm, such that µ₁ = 1 and µₖ = 0.5 for k ≠ 1, and the number of agents N = 50, where we repeat each experiment 100 times with G selected randomly from different families of random graphs. We set = 1.1 and γ = max{3, d?(G)/2}. We draw delays from a bounded distribution with E[ ] = 10 and max = 50. We draw corruptions uniformly from the range [0, ] for each message, where is increased from 10⁻³ to 10⁻². We use γ = 2. |
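
Since no source code is released, the simulation ingredients quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The arm means, reward variance, agent count, and delay mean and bound come from the quoted setup; everything else is an assumption: the scaled Beta delay distribution, the additive form of the corruption, the Erdős–Rényi graph family, and all function and constant names are hypothetical.

```python
import random

K = 10                            # number of arms (from the quoted setup)
N_AGENTS = 50                     # number of agents (from the quoted setup)
MEANS = [1.0] + [0.5] * (K - 1)   # mean 1 for the first arm, 0.5 for the rest
SIGMA = 1.0                       # reward standard deviation of every arm
DELAY_MEAN, DELAY_MAX = 10, 50    # stated delay mean and upper bound


def draw_reward(arm, rng):
    """Gaussian reward for a 0-indexed arm."""
    return rng.gauss(MEANS[arm], SIGMA)


def draw_delay(rng):
    """Delay in [0, DELAY_MAX] with mean DELAY_MEAN.

    Assumption: a scaled Beta(1, b) distribution. The source states only
    the mean and the bound, not the distribution itself.
    """
    b = DELAY_MAX / DELAY_MEAN - 1   # Beta(1, b) has mean 1 / (1 + b)
    return DELAY_MAX * rng.betavariate(1, b)


def corrupt(message, eps, rng):
    """Assumption: corruption is added to a scalar message, drawn from U[0, eps]."""
    return message + rng.uniform(0.0, eps)


def random_graph(n, p, rng):
    """Erdos-Renyi G(n, p) as adjacency sets (one assumed graph family)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

A seeded `random.Random` instance can be passed as `rng` so that each of the 100 repetitions is reproducible, for example `random_graph(N_AGENTS, 0.1, random.Random(0))`, where the edge probability 0.1 is an arbitrary illustrative value.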