Achieving Outcome Fairness in Machine Learning Models for Social Decision Problems

Authors: Boli Fang, Miao Jiang, Pei-yi Cheng, Jerry Shen, Yi Fang

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments on various datasets demonstrate that our fairgroup construction method effectively boosts the fairness in automated decision making, while maintaining high prediction accuracy." |
| Researcher Affiliation | Academia | 1Dept of Computer Science, Indiana University; 2Dept of Intelligent Systems Engineering, Indiana University; 3Sol Price School of Public Policy, University of Southern California; 4Dept of Computer Science and Engineering, Santa Clara University. {bfang, miajiang, peicheng}@iu.edu, haoxuans@usc.edu, yfang@scu.edu |
| Pseudocode | No | The paper describes the steps of the algorithm but does not present them in a formally labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | "The source code of our experiments is available at https://github.com/miaojiang1987/AI-for-fairness." |
| Open Datasets | Yes | "For our experiments, we have focused on the United States Census American Community Survey data [Bureau, 2017], and considered two separate sub-datasets: the Medicaid dataset and the SNAP dataset." |
| Dataset Splits | No | "For both of the datasets, we have split our data into training and testing sets." The paper mentions training and testing sets but does not specify split percentages or a validation set. |
| Hardware Specification | No | The paper does not provide any details about the hardware (e.g., GPU or CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various machine learning models and algorithms (e.g., decision trees, regression models, random forests, K-median clustering) but does not provide version numbers for any software or libraries used. |
| Experiment Setup | Yes | "Once we have selected the protected feature (feature with largest importance score) of income, we group the entire dataset into 5 clusters by K-median clustering [Zhu and Shi, 2015]. The choice of K is determined by the standard choice of cluster numbers yielding the best empirical evaluations. In each cluster, we maintain the same ratio for poverty and non-poverty households by setting the balance as 2/4 between poverty and non-poverty households, and iteratively match points accordingly." |
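To make the quoted experiment setup concrete, the sketch below illustrates the two steps it describes: K-median clustering on the protected income feature (K = 5) and subsampling each cluster to a fixed 2/4 poverty-to-non-poverty ratio. This is not the authors' released code (see the repository linked above); the function names, the Lloyd-style median-update clustering, and the subsampling strategy are illustrative assumptions.

```python
# Sketch of the fairgroup-construction steps described in the paper:
# (1) K-median clustering on the 1-D protected feature (income),
# (2) balancing each cluster to a target poverty/non-poverty ratio.
# All names here are illustrative, not taken from the authors' code.
import numpy as np


def k_median(x, k=5, n_iter=100, seed=0):
    """Simple 1-D K-median clustering (Lloyd-style with median updates)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(x, size=k, replace=False)
    for _ in range(n_iter):
        # Assign each point to its nearest center (absolute distance).
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Update each center to the median of its assigned points.
        new_centers = np.array(
            [np.median(x[labels == j]) if np.any(labels == j) else centers[j]
             for j in range(k)]
        )
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def balance_cluster(indices, poverty_flags, ratio=2 / 4, seed=0):
    """Subsample a cluster so poverty : non-poverty is approximately `ratio`."""
    rng = np.random.default_rng(seed)
    poor = indices[poverty_flags[indices] == 1]
    rich = indices[poverty_flags[indices] == 0]
    # Keep as many households as possible at the target ratio.
    n_poor = min(len(poor), int(len(rich) * ratio))
    n_rich = min(len(rich), int(n_poor / ratio)) if ratio > 0 else len(rich)
    keep = np.concatenate([rng.choice(poor, n_poor, replace=False),
                           rng.choice(rich, n_rich, replace=False)])
    return np.sort(keep)


# Usage on synthetic incomes and poverty labels:
rng = np.random.default_rng(1)
incomes = rng.normal(50_000, 20_000, 300)
is_poor = rng.integers(0, 2, 300)
labels, centers = k_median(incomes, k=5)
cluster0 = np.flatnonzero(labels == 0)
balanced = balance_cluster(cluster0, is_poor, ratio=2 / 4, seed=1)
```

The paper's "iteratively match points accordingly" step is summarized here as a single subsampling pass; the actual matching procedure may differ.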