The price of unfairness in linear bandits with biased feedback
Authors: Solenne Gaucher, Alexandra Carpentier, Christophe Giraud
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we study the problem of fair sequential decision making with biased linear bandit feedback. At each round, a player selects an action described by a covariate and by a sensitive attribute. The perceived reward is a linear combination of the covariates of the chosen action, but the player only observes a biased evaluation of this reward, depending on the sensitive attribute. To characterize the difficulty of this problem, we design a phased elimination algorithm that corrects the unfair evaluations, and establish upper bounds on its regret. We show that the worst-case regret is smaller than O( 1/3 log(T)1/3T 2/3), where is an explicit geometrical constant characterizing the difficulty of bias estimation. We prove lower bounds on the worst-case regret for some sets of actions showing that this rate is tight up to a possible sub-logarithmic factor. We also derive gap-dependent upper bounds on the regret, and matching lower bounds for some problem instance. |
| Researcher Affiliation | Academia | Solenne Gaucher Département de Mathématiques d Orsay, Université Paris-Saclay, Orsay, France solenne.gaucher@math.u-psud.fr Alexandra Carpentier Institut für Mathematik, Universität Potsdam, Potsdam, Germany carpentier@uni-potsdam.de Christophe Giraud Département de Mathématiques d Orsay, Université Paris-Saclay, Orsay, France christophe.giraud@universite-paris-saclay.fr |
| Pseudocode | Yes | Routine 1 G-EXP-ELIM (X, n, ) Routine 2 -EXP-ELIM (X, (X (z), b (z))z2{ 1,1}, b , n, ) Algorithm 3 FAIR PHASED ELIMINATION (sketched) |
| Open Source Code | No | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A] |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets, thus no dataset availability information is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments, therefore no dataset splits for validation are mentioned. |
| Hardware Specification | No | The paper is theoretical and does not report on experiments that would require specific hardware. No hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not report on experiments that would require specific software dependencies with version numbers. No such details are provided. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings. |