Achieving Outcome Fairness in Machine Learning Models for Social Decision Problems
Authors: Boli Fang, Miao Jiang, Pei-yi Cheng, Jerry Shen, Yi Fang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various datasets demonstrate that our fairgroup construction method effectively boosts the fairness in automated decision making, while maintaining high prediction accuracy. |
| Researcher Affiliation | Academia | 1 Dept of Computer Science, Indiana University; 2 Dept of Intelligent Systems Engineering, Indiana University; 3 Sol Price School of Public Policy, University of Southern California; 4 Dept of Computer Science and Engineering, Santa Clara University. {bfang, miajiang, peicheng}@iu.edu, haoxuans@usc.edu, yfang@scu.edu |
| Pseudocode | No | The paper describes the steps of the algorithm but does not present them in a formally labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | The source code of our experiments is available at https://github.com/miaojiang1987/AI-for-fairness. |
| Open Datasets | Yes | For our experiments, we have focused on the United States Census American Community Survey data [Bureau, 2017], and considered two separate sub-datasets: the Medicaid dataset and the SNAP dataset. |
| Dataset Splits | No | For both of the datasets, we have split our data into training and testing sets. The paper mentions training and testing sets but does not specify split percentages or a validation set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU, CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various machine learning models and algorithms (e.g., decision trees, regression models, random forests, K-median clustering) but does not provide specific version numbers for any software or libraries used. |
| Experiment Setup | Yes | Once we have selected the protected feature (feature with largest importance score) of income, we group the entire dataset into 5 clusters by K-median clustering [Zhu and Shi, 2015]. The choice of K is determined by the standard choice of cluster numbers yielding the best empirical evaluations. In each cluster, we maintain the same ratio for poverty and non-poverty households by setting the balance as 2/4 between poverty and non-poverty households, and iteratively match points accordingly. |
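The setup described above (K-median clustering into 5 clusters, then balancing poverty and non-poverty households within each cluster) can be sketched as follows. This is a minimal illustration, not the authors' released code: it assumes a simple Lloyd-style K-median with L1 distance in place of the cited [Zhu and Shi, 2015] procedure, and the `balance_cluster` helper and its `ratio` parameter are one hypothetical reading of the paper's "balance as 2/4" between poverty and non-poverty households.

```python
import numpy as np

def k_median(X, k=5, n_iter=50, seed=0):
    """Lloyd-style K-median clustering with L1 (Manhattan) distance.

    A simplified stand-in for the K-median method cited in the paper;
    the update step takes the coordinate-wise median of each cluster.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center under L1 distance.
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                new_centers[j] = np.median(members, axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def balance_cluster(indices, is_poverty, ratio=0.5):
    """Subsample a cluster so poverty : non-poverty is roughly ratio : 1.

    Hypothetical interpretation of the "2/4" balance: keep at most
    `ratio` poverty households per non-poverty household.
    """
    pov = [i for i in indices if is_poverty[i]]
    non = [i for i in indices if not is_poverty[i]]
    n_pov = min(len(pov), int(len(non) * ratio))
    return pov[:n_pov] + non
```

In this sketch the protected feature (income, the feature with the largest importance score) would already be part of the feature matrix `X`; the balanced index lists from `balance_cluster` would then feed the paper's iterative matching step.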