reproducibilityindex.ai

Fair Algorithms for Multi-Agent Multi-Armed Bandits

Authors: Safwan Hossain, Evi Micha, Nisarg Shah

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We propose a multi-agent variant of the classical multi-armed bandit problem, in which there are N agents and K arms, and pulling an arm generates a (possibly different) stochastic reward for each agent. Unlike the classical multi-armed bandit problem, the goal is not to learn the best arm ; indeed, each agent may perceive a different arm to be the best for her personally. Instead, we seek to learn a fair distribution over the arms. Drawing on a long line of research in economics and computer science, we use the Nash social welfare as our notion of fairness. We design multi-agent variants of three classic multi-armed bandit algorithms and show that they achieve sublinear regret, which is now measured in terms of the lost Nash social welfare. ... Our work is purely theoretical in nature, and we do not anticipate any immediate societal risks.
Researcher Affiliation	Academia	Safwan Hossain Department of Computer Science University of Toronto safwan.hossain@mail.utoronto.ca Evi Micha Department of Computer Science University of Toronto emicha@cs.toronto.edu Nisarg Shah Department of Computer Science University of Toronto nisarg@cs.toronto.edu
Pseudocode	Yes	Algorithm 1: UCB Input: Number of agents N, number of arms K Parameters :Conﬁdence parameter αt for each t N // Pull each arm once for t = 1, . . . , K do pt policy that puts probability 1 on arm t // Pull arm t deterministically end for t = K + 1, . . . do Compute the estimated reward matrix bµt pt arg maxp K NSW(p, bµt) + αt P j [K] pj rt j, where rt j q
Open Source Code	No	The paper states 'Our work is purely theoretical in nature' and does not mention releasing open-source code for the described algorithms or providing any links to code repositories.
Open Datasets	No	The paper is theoretical and does not involve empirical studies using datasets.
Dataset Splits	No	The paper is theoretical and does not involve empirical studies, therefore no training/validation/test splits are discussed.
Hardware Specification	No	The paper is purely theoretical and does not report on experimental setups or hardware used.
Software Dependencies	No	The paper is purely theoretical and does not describe software dependencies with version numbers, as it does not report on experimental implementations.
Experiment Setup	No	The paper is purely theoretical and does not detail experimental setups or hyperparameters.