What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning
Authors: Zhihong Deng, Jing Jiang, Guodong Long, Chengqi Zhang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the proposed techniques in explaining, detecting, and reducing inequality in reinforcement learning. In this section, we empirically validate the theoretical results of our work through extensive experiments. |
| Researcher Affiliation | Academia | Zhihong Deng, Jing Jiang, Guodong Long and Chengqi Zhang, Australian Artificial Intelligence Institute, University of Technology Sydney, zhi-hong.deng@student.uts.edu.au, {jing.jiang,guodong.long,chengqi.zhang}@uts.edu.au |
| Pseudocode | Yes | The detailed algorithm is presented in Appendix B. |
| Open Source Code | Yes | We publicly release code at https://github.com/familyld/InsightFair. |
| Open Datasets | Yes | Specifically, we use the Allocation environment that contains two demographic groups. At each step, some incidents like rat infestations occur in both groups, depending on their current state. The agent must allocate resources effectively to minimize both missed incidents and resources allocated. Groups do not interact with each other. By manipulating the reward function and the transition dynamics, we can generate different environments to simulate scenarios where the environment satisfies or violates dynamics fairness. Reference: [D'Amour et al., 2020] Alexander D'Amour, Hansa Srinivasan, James Atwood, Pallavi Baljekar, David Sculley, and Yoni Halpern. Fairness is not static: Deeper understanding of long term fairness via simulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 525-534, 2020. |
| Dataset Splits | No | The paper does not explicitly provide details on training, validation, or test dataset splits, percentages, or how data was partitioned for model training and evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or processor types used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'ML-fairness-gym suite' and 'PETS' but does not specify version numbers for these or any other software dependencies, nor does it list specific programming languages with versions. |
| Experiment Setup | No | The paper states that 'The results are averaged over five runs with different random seeds.' but does not provide specific hyperparameter values, training configurations, or other detailed experimental setup parameters. |
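
Because the table reports no environment configuration or hyperparameters, the following is only a rough sketch of the kind of setup the Open Datasets and Experiment Setup rows describe: two non-interacting groups, a reward that penalizes both missed incidents and allocated resources, and results averaged over five seeds. It is not the ML-fairness-gym Allocation environment or the authors' code. The class `ToyAllocationEnv`, the helper functions, the `drifts` parameter (a stand-in for group-specific transition dynamics), and all numeric defaults are assumptions made for illustration.

```python
import random
from statistics import mean


class ToyAllocationEnv:
    """Hypothetical two-group allocation toy; NOT the ML-fairness-gym API."""

    def __init__(self, drifts, init_rate=0.3, n_sites=5, cost=0.1, seed=0):
        self.drifts = drifts                  # assumed per-group dynamics parameter
        self.rates = [init_rate, init_rate]   # per-group state: incident probability
        self.n_sites = n_sites                # sites per group that may report an incident
        self.cost = cost                      # assumed price of one allocated unit
        self.rng = random.Random(seed)

    def step(self, allocation):
        """Apply one allocation (units per group) and return per-group rewards."""
        rewards = []
        for g, units in enumerate(allocation):
            # Incidents depend only on the group's own state (groups do not interact).
            incidents = sum(self.rng.random() < self.rates[g] for _ in range(self.n_sites))
            missed = max(incidents - units, 0)
            # Penalize both missed incidents and resources spent.
            rewards.append(-missed - self.cost * units)
            # Missed incidents raise the future incident rate, scaled by the group's drift.
            self.rates[g] = min(1.0, self.rates[g] + self.drifts[g] * missed)
        return rewards


def run_episode(seed, drifts, allocation=(1, 1), horizon=50, init_rate=0.3):
    """Roll out a fixed allocation policy and return the total reward per group."""
    env = ToyAllocationEnv(drifts, init_rate=init_rate, seed=seed)
    totals = [0.0, 0.0]
    for _ in range(horizon):
        for g, r in enumerate(env.step(allocation)):
            totals[g] += r
    return tuple(totals)


def average_returns(drifts, seeds=range(5), **kwargs):
    """Average per-group returns over several seeds (five by default)."""
    per_seed = [run_episode(s, drifts, **kwargs) for s in seeds]
    return tuple(mean(group) for group in zip(*per_seed))


if __name__ == "__main__":
    # Same drift for both groups: the toy analogue of identical dynamics.
    print("equal dynamics:  ", average_returns((0.02, 0.02)))
    # Different drifts: the toy analogue of dynamics that differ across groups.
    print("unequal dynamics:", average_returns((0.0, 0.05)))
```

Comparing the two printed tuples gives a feel for how a gap in returns can arise from the dynamics alone even under an identical allocation policy; the numbers themselves carry no meaning beyond this toy.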