reproducibilityindex.ai

Fairness in Reinforcement Learning

Authors: Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth

ICML 2017

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness. Second, we present an approximate-action fair algorithm (Fair-E3) in Section 4 and prove a polynomial upper bound on the time it requires to achieve near-optimality.
Researcher Affiliation	Academia	University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth <jabbari, majos, mkearns, jamiemor, aaroth@cis.upenn.edu>.
Pseudocode	No	The paper describes the Fair-E3 algorithm informally in text (Sections 4.1, 4.2, and 4.3), but it does not contain structured pseudocode or an algorithm block.
Open Source Code	No	The paper does not contain any explicit statement about providing open-source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets	No	The paper is purely theoretical and does not conduct experiments on datasets, thus no information about public dataset availability is provided.
Dataset Splits	No	The paper is theoretical and does not involve empirical evaluation with datasets, thus no training/validation/test dataset split information is provided.
Hardware Specification	No	The paper is theoretical and does not describe any computational experiments, thus no specific hardware specifications are mentioned.
Software Dependencies	No	The paper is theoretical and does not describe any implementation, so no software dependencies or version numbers are specified.
Experiment Setup	No	The paper is theoretical and does not describe empirical experiments, therefore no specific experimental setup details, such as hyperparameters or training configurations, are provided.