Fairness and Welfare Quantification for Regret in Multi-Armed Bandits

Authors: Siddharth Barman, Arindam Khan, Arnab Maiti, Ayush Sawarni

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This work develops an algorithm that, given the horizon of play T, achieves a Nash regret of O(√(k log T / T)); here, k denotes the number of arms in the MAB instance. ... We develop an algorithm that achieves Nash regret of O(√(k log T / T)); here, k denotes the number of arms in the bandit instance and T is the given horizon of play (Theorem 1 and Theorem 2). (The bound is restated below the table.)
Researcher Affiliation | Academia | 1: Indian Institute of Science; 2: University of Washington
Pseudocode | Yes | Algorithm 1: Nash Confidence Bound Algorithm (a hedged sketch follows the table)
Open Source Code | No | No statement providing concrete access to source code (a specific repository link, an explicit code release statement, or code in supplementary materials) was found for the methodology described in this paper.
Open Datasets | No | The paper is theoretical and focuses on algorithm design and proofs, not empirical evaluation on specific datasets; therefore, no information about publicly available training datasets is provided.
Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with datasets; therefore, no dataset split information is provided.
Hardware Specification | No | The paper is theoretical and does not describe empirical experiments that would require specific hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe empirical experiments that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and focuses on algorithm design and proofs, not empirical experimental setup details such as hyperparameters or training configurations.
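
For context on the bound quoted in the Research Type row: Nash regret replaces the arithmetic mean used in standard (average) regret with a geometric mean over rounds, so even occasional rounds of very low expected reward are penalized. The LaTeX restatement below follows my reading of the paper's abstract; the exact definition and the form of the anytime guarantee (Theorem 2) are not reproduced from the paper itself and should be checked against it.

% Nash regret (per my reading of the paper): the optimal mean minus the
% geometric mean of the expected per-round rewards over the horizon T.
\[
  \mathrm{NR}_T \;=\; \mu^{*} \;-\; \Bigl(\prod_{t=1}^{T} \mathbb{E}\bigl[\mu_{I_t}\bigr]\Bigr)^{1/T},
  \qquad
  \mathrm{NR}_T \;=\; O\!\left(\sqrt{\frac{k \log T}{T}}\right).
\]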
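
Since the table reports no open-source code, here is a minimal, self-contained Python sketch in the spirit of the Nash Confidence Bound Algorithm named above: a uniform-exploration phase followed by a UCB-style index whose confidence width scales with the empirical mean. The phase length, the constant c, and the exact index form are illustrative assumptions, not the paper's tuned quantities.

import math
import random

def ncb_sketch(reward_fns, T, c=3.0):
    """UCB-style sketch in the spirit of a Nash Confidence Bound index.

    reward_fns: list of k callables, each returning a stochastic reward in [0, 1].
    T: horizon of play (assumed known, as in the fixed-horizon setting).
    c: illustrative confidence constant (an assumption, not the paper's value).
    """
    k = len(reward_fns)
    counts = [0] * k
    sums = [0.0] * k

    # Phase 1: uniform exploration. The length sqrt(k * T * log T),
    # capped at T, is an illustrative assumption.
    t_explore = min(T, int(math.sqrt(k * T * math.log(T))))
    for _ in range(t_explore):
        i = random.randrange(k)  # sample an arm uniformly at random
        counts[i] += 1
        sums[i] += reward_fns[i]()

    # Phase 2: pull the arm with the highest index. The width scales with
    # sqrt(mu_hat), so low-mean arms receive smaller exploration bonuses --
    # the feature that (loosely) protects the geometric mean of rewards.
    for _ in range(t_explore, T):
        best, best_idx = 0, -1.0
        for i in range(k):
            if counts[i] == 0:
                idx = float("inf")  # unexplored arm: try it first
            else:
                mu_hat = sums[i] / counts[i]
                idx = mu_hat + c * math.sqrt(mu_hat * math.log(T) / counts[i])
            if idx > best_idx:
                best, best_idx = i, idx
        counts[best] += 1
        sums[best] += reward_fns[best]()

    # Return the empirical mean of each arm after T rounds.
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]

if __name__ == "__main__":
    random.seed(0)
    means = [0.9, 0.5, 0.3]  # hypothetical Bernoulli arms
    arms = [lambda m=m: 1.0 if random.random() < m else 0.0 for m in means]
    print(ncb_sketch(arms, T=20_000))

The mean-scaled width is the intended contrast with standard UCB, whose width √(log T / n) does not shrink with the empirical mean: arms with small estimated means get narrower bonuses here, which is the sketch's stand-in for the confidence bound that targets geometric-mean (Nash) regret.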