reproducibilityindex.ai

The price of unfairness in linear bandits with biased feedback

Authors: Solenne Gaucher, Alexandra Carpentier, Christophe Giraud

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we study the problem of fair sequential decision making with biased linear bandit feedback. At each round, a player selects an action described by a covariate and by a sensitive attribute. The perceived reward is a linear combination of the covariates of the chosen action, but the player only observes a biased evaluation of this reward, depending on the sensitive attribute. To characterize the difﬁculty of this problem, we design a phased elimination algorithm that corrects the unfair evaluations, and establish upper bounds on its regret. We show that the worst-case regret is smaller than O( 1/3 log(T)1/3T 2/3), where is an explicit geometrical constant characterizing the difﬁculty of bias estimation. We prove lower bounds on the worst-case regret for some sets of actions showing that this rate is tight up to a possible sub-logarithmic factor. We also derive gap-dependent upper bounds on the regret, and matching lower bounds for some problem instance.
Researcher Affiliation	Academia	Solenne Gaucher Département de Mathématiques d Orsay, Université Paris-Saclay, Orsay, France solenne.gaucher@math.u-psud.fr Alexandra Carpentier Institut für Mathematik, Universität Potsdam, Potsdam, Germany carpentier@uni-potsdam.de Christophe Giraud Département de Mathématiques d Orsay, Université Paris-Saclay, Orsay, France christophe.giraud@universite-paris-saclay.fr
Pseudocode	Yes	Routine 1 G-EXP-ELIM (X, n, ) Routine 2 -EXP-ELIM (X, (X (z), b (z))z2{ 1,1}, b , n, ) Algorithm 3 FAIR PHASED ELIMINATION (sketched)
Open Source Code	No	Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]
Open Datasets	No	The paper is theoretical and does not conduct experiments involving datasets, thus no dataset availability information is provided.
Dataset Splits	No	The paper is theoretical and does not conduct experiments, therefore no dataset splits for validation are mentioned.
Hardware Specification	No	The paper is theoretical and does not report on experiments that would require specific hardware. No hardware specifications are provided.
Software Dependencies	No	The paper is theoretical and does not report on experiments that would require specific software dependencies with version numbers. No such details are provided.
Experiment Setup	No	The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings.