Fairness in Reinforcement Learning
Authors: Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness. Second, we present an approximate-action fair algorithm (Fair-E3) in Section 4 and prove a polynomial upper bound on the time it requires to achieve near-optimality. |
| Researcher Affiliation | Academia | University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth <jabbari, majos, mkearns, jamiemor, aaroth@cis.upenn.edu>. |
| Pseudocode | No | The paper describes the Fair-E3 algorithm informally in text (Sections 4.1, 4.2, and 4.3), but it does not contain structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper is purely theoretical and does not conduct experiments on datasets, thus no information about public dataset availability is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical evaluation with datasets, thus no training/validation/test dataset split information is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe any computational experiments, thus no specific hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any implementation details that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe empirical experiments, therefore no specific experimental setup details, such as hyperparameters or training configurations, are provided. |