What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning

Authors: Zhihong Deng, Jing Jiang, Guodong Long, Chengqi Zhang

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate the effectiveness of the proposed techniques in explaining, detecting, and reducing inequality in reinforcement learning. In this section, we empirically validate the theoretical results of our work through extensive experiments."
Researcher Affiliation | Academia | "Zhihong Deng, Jing Jiang, Guodong Long and Chengqi Zhang. Australian Artificial Intelligence Institute, University of Technology Sydney. zhi-hong.deng@student.uts.edu.au, {jing.jiang,guodong.long,chengqi.zhang}@uts.edu.au"
Pseudocode | Yes | "The detailed algorithm is presented in Appendix B."
Open Source Code | Yes | "We publicly release code at https://github.com/familyld/InsightFair."
Open Datasets | Yes | "Specifically, we use the Allocation environment that contains two demographic groups. At each step, some incidents like rat infestations occur in both groups, depending on their current state. The agent must allocate resources effectively to minimize both missed incidents and resources allocated. Groups do not interact with each other. By manipulating the reward function and the transition dynamics, we can generate different environments to simulate scenarios where the environment satisfies or violates dynamics fairness." [D'Amour et al., 2020] Alexander D'Amour, Hansa Srinivasan, James Atwood, Pallavi Baljekar, David Sculley, and Yoni Halpern. Fairness is not static: Deeper understanding of long term fairness via simulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 525-534, 2020.
Dataset Splits | No | The paper does not explicitly describe training, validation, or test splits, split percentages, or how data was partitioned for model training and evaluation.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU/CPU models, memory, or processor types.
Software Dependencies | No | The paper mentions the ML-fairness-gym suite and PETS but gives no version numbers for these or any other software dependencies, nor does it list programming languages with versions.
Experiment Setup | No | The paper states that "The results are averaged over five runs with different random seeds." but does not provide specific hyperparameter values, training configurations, or other detailed experimental setup parameters.
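To make the Open Datasets evidence concrete, the quoted description of the Allocation environment can be sketched as a toy simulator. This is a hypothetical illustration, not the authors' implementation: the class name, the Poisson incident model, the state definition (unresolved incidents carrying over), and the cost weight are all assumptions chosen to match the quoted behavior (two non-interacting groups, state-dependent incidents, a reward penalizing both missed incidents and resources spent).

```python
import numpy as np

class AllocationEnv:
    """Toy sketch of a two-group resource-allocation environment.

    At each step, incidents arrive in each group at a rate that depends on
    the group's current state, and the agent chooses how many resource
    units to allocate to each group. Groups do not interact. The reward
    penalizes both missed incidents and resources allocated.
    """

    def __init__(self, incident_rates=(0.6, 0.4), cost_weight=0.1, seed=0):
        self.rates = np.array(incident_rates, dtype=float)  # per-group base rates
        self.cost_weight = cost_weight                      # penalty per resource unit
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(2)                            # unresolved incidents per group

    def reset(self):
        self.state = np.zeros(2)
        return self.state.copy()

    def step(self, allocation):
        allocation = np.asarray(allocation, dtype=float)
        # Incidents arrive stochastically; unresolved incidents raise the rate,
        # so the arrival process depends on the current state.
        incidents = self.rng.poisson(self.rates * (1.0 + self.state))
        missed = np.maximum(incidents - allocation, 0.0)
        self.state = missed  # unresolved incidents carry over to the next step
        # Reward penalizes missed incidents and resources spent.
        reward = -missed.sum() - self.cost_weight * allocation.sum()
        return self.state.copy(), reward

# Rolling out a naive equal-allocation policy for a few steps:
env = AllocationEnv(seed=42)
obs = env.reset()
total_reward = 0.0
for _ in range(10):
    obs, r = env.step([1.0, 1.0])
    total_reward += r
```

Changing `incident_rates` or the reward weights corresponds to the paper's point that manipulating the transition dynamics and reward function yields environments that satisfy or violate dynamics fairness.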