Counterfactual Fairness by Combining Factual and Counterfactual Predictions

Authors: Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, David I. Inouye

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on both synthetic and semi-synthetic datasets demonstrate the validity of our analysis and methods.
Researcher Affiliation | Academia | Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, David I. Inouye; Elmore Family School of Electrical and Computer Engineering, Purdue University; {zhou1059, liu3351, bai116, jinggao, mkocaoglu, dinouye}@purdue.edu
Pseudocode | Yes | Algorithm 1: Plug-in Counterfactual Fairness (PCF) (an illustrative sketch follows the table)
Open Source Code | Yes | Code can be found at https://github.com/inouye-lab/pcf
Open Datasets | Yes | In this section, we consider the Law School Success dataset [Wightman, 1998], where the sensitive attribute is gender and the target is first-year grade. ... To compute TE, we need access to ground-truth counterfactuals. Hence we train a generative model on the real dataset to generate a semi-synthetic dataset following the method in Zuo et al. [2023].
Dataset Splits | No | The paper mentions training, testing, and sometimes evaluating on a subset of the data (e.g., the Law School dataset), but it does not provide explicit split ratios (such as "80/10/10") or sample counts for the train/validation/test sets. It states "Given a test set Dtest" when defining metrics, but not how this test set is formed in relation to the other splits.
Hardware Specification | Yes | All GPU-related experiments are run on an RTX A5000.
Software Dependencies | No | In our synthetic experiments, we mainly use KNN-based predictors. We use the default parameters in scikit-learn. All MLP methods use a structure with hidden layers (20, 20) and Tanh activation. In semi-synthetic experiments, all MLP methods use a structure with hidden layers (5, 5) and Tanh activation, as this is closer to the ground-truth SCM. The paper mentions tools such as scikit-learn but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | All MLP methods use a structure with hidden layers (20, 20) and Tanh activation. In semi-synthetic experiments, all MLP methods use a structure with hidden layers (5, 5) and Tanh activation, as this is closer to the ground-truth SCM. (A configuration sketch follows the table.)
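
The Pseudocode row above refers to Algorithm 1, Plug-in Counterfactual Fairness (PCF). As a rough Python illustration of the plug-in idea only, not the authors' exact algorithm, the sketch below averages a fitted factual predictor's outputs over counterfactual versions of the features for every value of the sensitive attribute. The class name PlugInCFPredictor and the counterfactual_features callable are hypothetical; in practice the counterfactual features would have to come from an SCM or a trained generative model such as the one mentioned in the Open Datasets row.

    import numpy as np

    class PlugInCFPredictor:
        """Illustrative plug-in counterfactual-fairness predictor.

        base_model: a fitted predictor exposing a .predict(X) method.
        counterfactual_features: callable (X, a) -> feature matrix under the
            intervention A <- a (e.g., produced by an SCM or generative model).
        sensitive_values: iterable of all values the sensitive attribute can take.
        """

        def __init__(self, base_model, counterfactual_features, sensitive_values):
            self.base_model = base_model
            self.counterfactual_features = counterfactual_features
            self.sensitive_values = sensitive_values

        def predict(self, X):
            # Average the base model's predictions over the factual/counterfactual
            # feature versions for every value of the sensitive attribute, so the
            # output no longer depends on its factual value.
            preds = [
                self.base_model.predict(self.counterfactual_features(X, a))
                for a in self.sensitive_values
            ]
            return np.mean(preds, axis=0)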
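
The Software Dependencies and Experiment Setup rows describe KNN predictors with scikit-learn defaults and MLPs with Tanh activation and hidden layers (20, 20) or (5, 5). A minimal configuration sketch under those settings is shown below; the choice of regression estimators (KNeighborsRegressor and MLPRegressor) is an assumption, since the excerpt names scikit-learn but not the exact estimator classes or version.

    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.neural_network import MLPRegressor

    # KNN-based predictor with scikit-learn's default parameters (synthetic experiments).
    knn_predictor = KNeighborsRegressor()

    # MLP for the synthetic experiments: two hidden layers of 20 units, Tanh activation.
    mlp_synthetic = MLPRegressor(hidden_layer_sizes=(20, 20), activation="tanh")

    # MLP for the semi-synthetic experiments: two hidden layers of 5 units,
    # reported as closer to the ground-truth SCM.
    mlp_semi_synthetic = MLPRegressor(hidden_layer_sizes=(5, 5), activation="tanh")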