Achieving Outcome Fairness in Machine Learning Models for Social Decision Problems
Authors: Boli Fang, Miao Jiang, Pei-yi Cheng, Jerry Shen, Yi Fang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various datasets demonstrate that our fairgroup construction method effectively boosts the fairness in automated decision making, while maintaining high prediction accuracy. |
| Researcher Affiliation | Academia | 1 Dept of Computer Science, Indiana University; 2 Dept of Intelligent Systems Engineering, Indiana University; 3 Sol Price School of Public Policy, University of Southern California; 4 Dept of Computer Science and Engineering, Santa Clara University. {bfang, miajiang, peicheng}@iu.edu, haoxuans@usc.edu, yfang@scu.edu |
| Pseudocode | No | The paper describes the steps of the algorithm but does not present them in a formally labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | Yes | The source code of our experiments is available at https://github.com/miaojiang1987/AI-for-fairness. |
| Open Datasets | Yes | For our experiments, we have focused on the United States Census American Community Survey data [Bureau, 2017], and considered two separate sub-datasets: the Medicaid dataset and the SNAP dataset. |
| Dataset Splits | No | For both of the datasets, we have split our data into training and testing sets. The paper mentions training and testing sets but does not specify split percentages or a validation set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU, CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions various machine learning models and algorithms (e.g., decision trees, regression models, random forests, K-median clustering) but does not provide specific version numbers for any software or libraries used. |
| Experiment Setup | Yes | Once we have selected the protected feature (feature with largest importance score) of income, we group the entire dataset into 5 clusters by K-median clustering [Zhu and Shi, 2015]. The choice of K is determined by the standard choice of cluster numbers yielding the best empirical evaluations. In each cluster, we maintain the same ratio for poverty and non-poverty households by setting the balance as 2/4 between poverty and non-poverty households, and iteratively match points accordingly. |
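The setup described above (K-median clustering into 5 clusters, then balancing poverty and non-poverty households within each cluster) can be sketched as follows. This is a minimal illustration, not the authors' released code: it assumes a simple Lloyd-style K-median with L1 distance in place of the cited [Zhu and Shi, 2015] procedure, and the `balance_cluster` helper and its `ratio` parameter are one hypothetical reading of the paper's "balance as 2/4" between poverty and non-poverty households.

```python
import numpy as np

def k_median(X, k=5, n_iter=50, seed=0):
    """Lloyd-style K-median clustering with L1 (Manhattan) distance.

    A simplified stand-in for the K-median method cited in the paper;
    the update step takes the coordinate-wise median of each cluster.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center under L1 distance.
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):
                new_centers[j] = np.median(members, axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def balance_cluster(indices, is_poverty, ratio=0.5):
    """Subsample a cluster so poverty : non-poverty is roughly ratio : 1.

    Hypothetical interpretation of the "2/4" balance: keep at most
    `ratio` poverty households per non-poverty household.
    """
    pov = [i for i in indices if is_poverty[i]]
    non = [i for i in indices if not is_poverty[i]]
    n_pov = min(len(pov), int(len(non) * ratio))
    return pov[:n_pov] + non
```

In this sketch the protected feature (income, the feature with the largest importance score) would already be part of the feature matrix `X`; the balanced index lists from `balance_cluster` would then feed the paper's iterative matching step.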